Hendri Karisma

Measuring Software Carbon Intensity (SCI) Across Modern Backend Frameworks: Implementing Green Coding Strategies in Enterprise Ecosystems

Research

::: abstract This research presents a comprehensive assessment and comparison of Software Carbon Intensity (SCI) for seven modern backend frameworks: Java Spring Boot, Java Quarkus, Go Gin, Rust Axum, Python FastAPI, JavaScript Express, and C# .NET. In an era of massive digital transformation, the choice of backend technology affects not only application performance but also the resulting digital carbon footprint. We use a controlled experimental method to measure the energy consumption, response time, memory usage, and carbon emissions of each framework while it performs the same CRUD operations. The results show large variations in energy efficiency: compiled native languages such as Rust and Go are 40–60% more efficient than interpreted languages such as Python and JavaScript, while VM-based languages with JIT compilers, such as Java and C#, fall in the middle. This study offers evidence-based recommendations for industry practitioners who want to adopt Green Coding practices that reduce the environmental impact of enterprise systems. :::

Introduction

Background and Motivation

Climate change has become one of the most pressing challenges of the 21st century, and the Information and Communication Technology (ICT) sector is an increasingly significant contributor. Current estimates indicate that the ICT sector accounts for approximately 1.8% to 3.9% of global greenhouse gas emissions, a share comparable to that of the aviation industry [@Freitag2021]. This contribution is expected to grow substantially as digital transformation accelerates across all sectors of the global economy.

The rapid expansion of cloud computing, microservices architectures, and API-driven applications has made backend services more in demand than ever. Data centers around the world operate around the clock to serve these workloads, consuming large amounts of electricity and producing substantial carbon emissions. In this context, the choice of programming languages and frameworks for backend development becomes more than a technical decision; it is also an environmental one [@Koenig2020; @Wasif2024].

Green Software Engineering (GSE) has emerged as an important discipline for building software systems that are energy-efficient and environmentally sustainable [@Ardito2015]. A key GSE metric is Software Carbon Intensity (SCI), which quantifies the carbon emissions produced per functional unit of software operation. Although earlier studies have investigated algorithmic optimization for energy efficiency within individual programming languages (GSE Type 1 at the algorithmic level) [@Karisma2026; @Lannelongue2021], comprehensive comparative research on how different backend frameworks affect SCI values under realistic business workloads remains scarce. This research applies GSE Type 1 concepts at the level of framework selection, assessing the inherent efficiency differences between seven modern technology stacks running the same business logic.

The Technology Selection Problem

Modern enterprises commonly choose backend technologies based mainly on factors such as developer familiarity with the ecosystem or community popularity, without considering the long-term environmental impact [@GreenAdoption2024]. This approach overlooks the large differences in energy consumption between programming paradigms:

  • Interpreted Languages: Python and JavaScript (Node.js) dominate current development because of their rapid prototyping capabilities and rich libraries. However, their runtime interpretation overhead frequently leads to higher energy consumption than compiled alternatives [@Pereira2017; @PythonMicroscope2025; @vanKempen2024].

  • Compiled Native Languages: Languages such as Rust and Go descend from the C/C++ family and compile directly to machine code with minimal runtime overhead. Rust's ownership-based memory management removes the need for garbage collection, which may offer better energy efficiency [@RustOwnership2023].

  • Virtual Machine (VM) Based Languages: Java and C# run on advanced virtual machines (JVM and CLR, respectively) that use Just-In-Time (JIT) compilation. These platforms offer portability and mature ecosystems, but they add memory management overhead through their various garbage collection strategies [@JavaJIT2024; @DotNetCLR2024].

Research Gaps and Objectives

Despite growing awareness of Green Software Engineering principles, several important knowledge gaps remain:

  1. Comprehensive Framework Comparison: Most existing studies compare languages using benchmark programs rather than realistic application workloads [@Pereira2017; @ProgrammingRank2024].

  2. Per-Endpoint Analysis: Little research examines energy consumption at the level of individual API endpoints (CRUD operations), which is necessary for identifying optimization opportunities.

  3. Positive vs. Negative Cases: The energy impact of error handling and validation (negative test cases) compared with successful operations (positive cases) remains unexplored.

  4. Startup Time Impact: The energy an application consumes during startup, which matters for serverless and containerized deployments, is rarely quantified.

This research fills these gaps by empirically measuring SCI across seven contemporary backend frameworks that represent diverse runtime paradigms. We examine not only total energy use but also per-request metrics, startup overhead, and function-level profiling to identify energy hotspots.

Research Contributions

This study contributes to the field of Green Software Engineering in the following ways:

  1. Comprehensive Empirical Data: Detailed measurements of energy consumption, memory utilization, carbon emissions, response time, and CPU usage across seven production-ready backend frameworks that implement the same business logic.

  2. Per-Endpoint Granularity: A detailed analysis of CRUD operations (Create, Read, Update, Delete) across multiple domain entities (Users, Addresses, Carbon Credits, Tree Donations).

  3. Positive and Negative Case Analysis: A comparison of energy consumption between successful operations and error handling scenarios.

  4. Practical Recommendations: Data-driven guidance for software architects choosing technology for sustainable enterprise systems.

  5. Open Dataset: All experimental data, measurement scripts, and analysis tools, made available so that the research community can validate and extend the work.

The rest of this paper is organized as follows: Section 2 presents the theoretical foundation of Green Software Engineering and programming language characteristics; Section 3 details our experimental methodology; Section 4 presents the full results; Section 5 discusses the results and their implications; and Section 6 concludes with directions for future research.

Theoretical Background and Related Work

Principles of Green Software Engineering

Green Software Engineering is a discipline that addresses the need to reduce the environmental impact of computer systems across their full development and operational life cycle [@Ardito2015; @Koenig2020]. The field rests on three interconnected principles:

  1. Energy Efficiency: Reducing the electricity that software needs to run, which cuts the carbon emissions associated with electricity use during operation.

  2. Hardware Efficiency: Making the most of existing computational infrastructure through better software design, which lowers the need for additional physical hardware.

  3. Carbon Awareness: Incorporating knowledge of grid carbon intensity into computational scheduling decisions, so that workloads run when renewable energy sources dominate the power mix.

The fundamental reality of computing is that every processor instruction consumes electricity, and the resulting carbon emissions depend on how that electricity was generated [@Pathania2024]. Writing efficient software that reduces processing cycles, cuts memory allocation overhead, and optimizes input/output operations is therefore a direct way to reduce environmental impact.

Green Software Engineering Taxonomy: GSE Type 1 and Type 2

Recent academic research has developed a conceptual taxonomy that classifies software's relationship with environmental sustainability, distinguishing two fundamental approaches to green computing [@Karisma2026]:

  • GSE Type 1 (Green in Software Engineering): This type concerns the efficiency of software systems themselves, focusing on reducing computational resource consumption through architectural and algorithmic improvements. The main goal is to improve performance characteristics that translate directly into lower energy use and a smaller carbon footprint. Example interventions include algorithmic complexity reduction (for example, $O(n^2) \rightarrow O(n \log n)$), database query optimization, memory footprint reduction, and execution path efficiency improvements.

  • GSE Type 2 (Green by Software Engineering): This type covers software built specifically to support and monitor sustainability efforts in organizations and in society at large. Rather than optimizing the software's own resource use, these systems serve as tools for environmental goals such as carbon neutrality, ESG compliance, and climate action. Common examples include emission monitoring platforms, renewable energy grid management systems, sustainability metrics dashboards, and carbon tracking systems.

These two categories are distinct but complementary: Type 1 addresses the direct environmental cost of running software, whereas Type 2 uses software capabilities to enable and accelerate sustainability efforts in the wider world.

The current study focuses solely on GSE Type 1 concerns, examining how technology stack selection, particularly the programming language runtime architecture and web framework, affects the baseline energy efficiency of functionally equivalent backend systems. Whereas previous Type 1 research investigates algorithmic optimization opportunities within a single language environment [@Karisma2026], our contribution provides comparable cross-language energy measurements at the framework layer, which supports architectural decisions when building enterprise applications.

Software Carbon Intensity (SCI)

The Software Carbon Intensity metric, developed by the Green Software Foundation, provides a standardized method for measuring software carbon emissions. The basic SCI formula is:

$$SCI = \frac{(E \times I) + M}{R} \label{eq:sci}$$

Where:

  • $E$ = energy consumed by the software system (kWh)

  • $I$ = carbon intensity of the regional power grid (gCO2eq/kWh)

  • $M$ = embodied carbon emissions from hardware manufacturing

  • $R$ = functional unit (for example, per API request or per user session)

This study primarily examines operational efficiency, the $(E \times I)$ term. The embodied carbon $M$ stays constant across all experiments because they use the same hardware infrastructure. The regional carbon intensity $I$ is set to the global average of 475 gCO2eq/kWh, taken from the literature [@Lannelongue2021].
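To make the formula concrete, the following is a minimal Python sketch of the SCI calculation. The function name `sci` and the example inputs (0.002 kWh over 1,000 requests) are illustrative assumptions and not values from the experiments; only the formula and the 475 gCO2eq/kWh intensity come from the text.

```python
def sci(energy_kwh: float, intensity_g_per_kwh: float,
        embodied_g: float, functional_units: int) -> float:
    """Software Carbon Intensity in gCO2eq per functional unit: ((E * I) + M) / R."""
    if functional_units <= 0:
        raise ValueError("functional_units (R) must be positive")
    return (energy_kwh * intensity_g_per_kwh + embodied_g) / functional_units


GLOBAL_AVG_INTENSITY = 475.0  # gCO2eq/kWh, the value used in this study

# Hypothetical inputs: 0.002 kWh consumed while serving 1,000 API requests,
# with embodied carbon M set to 0 because it is identical across stacks.
example = sci(0.002, GLOBAL_AVG_INTENSITY, 0.0, 1000)
print(f"{example:.6f} gCO2eq per request")
```

Setting `embodied_g` to zero mirrors the study's focus on the operational term; a comparison across different hardware would need a real estimate for $M$.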

Energy Efficiency of Programming Languages

Recent research has examined energy consumption across programming languages in depth. Pereira et al. [@Pereira2017] conducted pioneering research covering 27 programming languages across 10 benchmark problems, finding that compiled languages such as C, C++, and Rust consistently outperformed interpreted languages in energy consumption by factors ranging from 2x to 75x.

More recent studies provide additional insight:

  • Language Rankings: New rankings based on energy measurements confirm that compiled native languages (C, C++, Rust, Go) are at the top for energy efficiency, while interpreted languages (Python, Ruby, Perl) use considerably more energy [@ProgrammingRank2024].

  • Python-Specific Analysis: A close look at Python execution methods shows that the CPython interpreter adds substantial overhead to energy use, with PyPy (JIT) and Cython (compiled) offering improvements [@PythonMicroscope2025].

  • Green AI Perspective: Analysis of machine learning workloads shows that language choice has a large effect on the carbon footprint of training and deploying AI models [@Marini2024].

Backend Framework Characteristics

This study analyzes seven frameworks that represent three distinct runtime architecture types:

Compiled Native Languages

Go (Gin Framework): Go, developed by Google, is a statically typed, compiled language designed for concurrent programming. Its key characteristics are:

  • Lineage: Descends from the C family and compiles to native machine code

  • Memory Management: Concurrent mark-and-sweep garbage collector optimized for low latency [@GoGC2023]

  • Concurrency: Lightweight goroutines make it easy to handle requests concurrently

  • Performance Profile: Fast startup, low memory footprint, predictable performance

Rust (Axum Framework): Rust provides memory safety without garbage collection through its ownership system [@RustOwnership2023]. Its characteristics include:

  • Lineage: A systems programming language influenced by C++, compiled to native code

  • Memory Management: Compile-time ownership checking eliminates runtime GC overhead

  • Safety Guarantees: Prevents common memory errors such as null pointer dereferences and data races at compile time

  • Performance Profile: Comparable to C/C++, zero-cost abstractions, minimal runtime

Virtual Machine with JIT Compilation

Java (Spring Boot and Quarkus): Java runs on the Java Virtual Machine (JVM) with advanced Just-In-Time compilation:

  • Lineage: Object-oriented language with a mature ecosystem (25+ years)

  • Memory Management: Generational garbage collection with multiple collector algorithms [@JavaJIT2024]

  • JIT Compilation: The HotSpot compiler optimizes frequently executed code paths at runtime

  • Spring Boot: A traditional JVM framework with an extensive feature set

  • Quarkus: A modern framework optimized for containers that supports native compilation with GraalVM

C# (.NET): Runs on the Common Language Runtime (CLR) with sophisticated JIT compilation:

  • Lineage: Designed to combine C++-like performance with Java-like managed memory

  • Memory Management: Generational GC with workstation and server modes [@DotNetCLR2024]

  • JIT Compilation: The RyuJIT compiler with tiered compilation for optimized performance

  • Performance Profile: Comparable to Java, with a modern runtime architecture

Interpreted/JIT Hybrid Languages

Python (FastAPI): Python is a dynamically typed, primarily interpreted language:

  • Lineage: A high-level language that prioritizes developer productivity

  • Execution Model: CPython interpreter with limited JIT optimization

  • Memory Management: Reference counting with a cycle-detecting GC

  • Performance Profile: Slower execution but faster development [@PythonMicroscope2025]

JavaScript (Express.js): Runs on the V8 engine with advanced JIT compilation:

  • Lineage: Originated as a browser scripting language and is now ubiquitous through Node.js

  • Execution Model: V8 engine with the TurboFan JIT compiler

  • Memory Management: Generational garbage collection

  • Event Loop: A single-threaded, event-driven architecture for I/O operations

Related Work on Measuring Energy

There are a few different ways to measure how much energy software uses:

  • Hardware Instrumentation: Power meters and specialized sensors give precise readings, but they require physical access to the infrastructure [@Chowdhury2018].

  • RAPL Interface: Intel's Running Average Power Limit (RAPL) exposes CPU package power consumption through model-specific registers, giving high accuracy for processor-bound workloads.

  • Software Estimation Models: Tools such as CodeCarbon and GreenScaler estimate energy consumption from hardware utilization metrics [@CodeCarbon; @Chowdhury2018].
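As an illustration of the RAPL approach, here is a minimal Python sketch that reads a cumulative energy counter and computes the energy used between two readings. It assumes the Linux powercap sysfs view of RAPL (`/sys/class/powercap/intel-rapl:0`), which typically requires elevated privileges to read; this is an alternative access path chosen for the example, not necessarily the one used in the study, which reads model-specific registers. The function names are hypothetical.

```python
from pathlib import Path

# Assumption: the Linux powercap sysfs view of RAPL package 0. The study itself
# reads RAPL through model-specific registers; this path is an alternative.
RAPL_DIR = Path("/sys/class/powercap/intel-rapl:0")


def read_counter_uj(rapl_dir: Path = RAPL_DIR) -> int:
    """Read the cumulative package energy counter in microjoules."""
    return int((rapl_dir / "energy_uj").read_text().strip())


def read_max_range_uj(rapl_dir: Path = RAPL_DIR) -> int:
    """Read the value at which the energy counter wraps around."""
    return int((rapl_dir / "max_energy_range_uj").read_text().strip())


def energy_delta_uj(before: int, after: int, max_range: int) -> int:
    """Energy consumed between two readings, allowing for one counter wraparound."""
    if after >= before:
        return after - before
    return (max_range - before) + after


def uj_to_kwh(microjoules: float) -> float:
    """Convert microjoules to kilowatt-hours (1 kWh = 3.6e12 microjoules)."""
    return microjoules / 3.6e12
```

The counter is cumulative and wraps, so any measurement window has to be short enough that at most one wraparound can occur between the two readings.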

Recent research has emphasized the importance of evaluating the full energy footprint of the software development lifecycle [@Rahman2022] and of collaborative green business process management [@Jakobi2016].

Methodology

This study uses a controlled experimental approach to ensure a fair comparison of seven technology stacks. Each framework implements the same business logic and the same functional requirements.

Experimental Design

Our experimental design uses a systematic framework to measure and compare the energy efficiency of seven backend frameworks under controlled conditions. The study uses a between-subjects design in which each framework represents a distinct treatment group, avoiding the carryover and learning effects that can occur in within-subjects designs.

The experimental setup ensures internal validity through rigorous control of confounding variables: all frameworks run on the same hardware, perform the same workload, and communicate with the same database configuration. External validity is addressed by implementing a realistic e-commerce application with representative business logic (user management, address handling, carbon credit transactions, and tree contribution tracking), which makes the results applicable to comparable production systems.

To ensure reliability, each test case is executed multiple times under different execution scenarios (both positive and negative), and the measurements are aggregated using statistical methods. The experimental procedure is fully automated and documented, enabling full reproducibility by independent researchers.

The research design follows the Goal-Question-Metric (GQM) framework [@Basili1994GQM]: our Goal is to assess energy efficiency; our Questions ask which framework achieves the lowest energy consumption, the best performance, and the lowest carbon emissions; our Metrics include energy consumption (kWh), response time (ms), memory utilization (MB), CPU usage (%), carbon emissions (gCO2eq), and startup time (seconds).

Research Variables

Table 1{reference-type="ref" reference="tab:research-variables"} gives a full picture of the research variables used in this study. The matrix sorts 19 variables into three groups: independent variables (7 technology stacks), dependent variables (6 measures of performance and sustainability), and control variables (6 environmental constants), each with its operational definition and measurement method.

::: {#tab:research-variables}
| Type | Variable | Operational Definition | Measurement |
|------|----------|------------------------|-------------|
| Independent Variables (Technology Stack Selection) | | | |
| IV-1 | Java Spring Boot 3.2 | Traditional JVM framework with extensive features | Framework selection |
| IV-2 | Java Quarkus 3.6 | Cloud-native JVM framework optimized for containers | Framework selection |
| IV-3 | Go Gin 1.9 | Compiled native with HTTP web framework | Framework selection |
| IV-4 | Rust Axum 0.7 | Compiled native with async web framework | Framework selection |
| IV-5 | Python FastAPI 0.104 | Asynchronous ASGI framework for Python 3.12 | Framework selection |
| IV-6 | JavaScript Express 4.18 | Node.js web application framework | Framework selection |
| IV-7 | C# .NET 6.0 | CLR-based framework with RyuJIT compiler | Framework selection |
| Dependent Variables (Performance and Sustainability Metrics) | | | |
| DV-1 | Energy Usage | Electrical energy consumed during execution (kWh) | Intel RAPL MSR registers |
| DV-2 | Response Time | Time between request and response (ms) | Application timestamps |
| DV-3 | Memory Usage | Peak and average RAM utilization (MB) | Docker Stats API |
| DV-4 | CPU Utilization | Percentage of CPU cycles used (%) | Docker Stats API |
| DV-5 | Carbon Emissions | CO2eq emissions from energy (gCO2eq) | E × 475 gCO2eq/kWh |
| DV-6 | Startup Time | Time for a container to start and pass a health check (s) | Container timestamps |
| Control Variables (Environmental Constants) | | | |
| CV-1 | Database System | PostgreSQL 15.0 with the same schema | Fixed configuration |
| CV-2 | Hardware | Intel Core i7-1270P (12 cores, max 4.8 GHz), RAPL-enabled | Physical machine |
| CV-3 | OS Environment | Ubuntu 22.04 LTS (Jammy Jellyfish) | System environment |
| CV-4 | Container Runtime | Docker 24.0.6 with containerd | Containerization |
| CV-5 | Resource Limits | 2 CPU cores, 2GB of RAM, no swap | Docker limits |
| CV-6 | Network | Localhost bridge (no external latency) | Docker networking |

: Matrix of Research Variables
:::

Application Implementation

Each framework implements the same RESTful API with the following features:

  • Domain Model: Four entities (User, Address, CarbonCredit, TreeContribution) with relationships

  • Business Logic: CRUD operations, validation, authentication, and aggregation queries

  • Database Operations: Connection pooling, prepared statements, transaction management

  • Validation: Indonesian phone number format, postal code validation, email validation

  • Security: BCrypt password hashing and input sanitization
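The validation rules listed above could look roughly like the following Python sketch. The exact rules used in the study are not stated, so every pattern here is an assumption made for illustration: mobile numbers are taken to start with "+62" or "0" followed by "8" and 8 to 11 further digits, postal codes are five digits, and the email check is deliberately loose. The function name `validate_user` is hypothetical.

```python
import re

# Assumptions for illustration only; the paper does not state its exact rules.
# Mobile numbers: "+62" or "0" prefix, then "8", then 8 to 11 more digits.
PHONE_RE = re.compile(r"(?:\+62|0)8\d{8,11}")
# Indonesian postal codes are five digits.
POSTAL_RE = re.compile(r"\d{5}")
# Deliberately loose email check: one "@", no whitespace, a dot in the domain.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def validate_user(phone: str, postal_code: str, email: str) -> list[str]:
    """Return a list of validation error messages (empty when the input is valid)."""
    errors = []
    if not PHONE_RE.fullmatch(phone):
        errors.append("invalid phone format")
    if not POSTAL_RE.fullmatch(postal_code):
        errors.append("invalid postal code")
    if not EMAIL_RE.fullmatch(email):
        errors.append("invalid email")
    return errors
```

Returning every error at once, instead of stopping at the first, is one way a negative test case can exercise several validators in a single request.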

All implementations follow SOLID principles and the best practices of their respective ecosystems.

Measurement Infrastructure

Table 2{reference-type="ref" reference="tab:measurement-tools"} lists the measurement tools used in this research and their specific roles. The infrastructure combines Intel RAPL for hardware-level energy monitoring with containerization metrics (Docker Stats), custom performance logging, and stack-specific profiling tools to capture a full picture of performance and sustainability data.

::: {#tab:measurement-tools}
| Category | Tool | Function |
|----------|------|----------|
| Energy | RAPL | CPU package power consumption |
| Memory | Docker Stats | Container memory consumption |
| Performance | Custom Logger | Response time measurement |
| Load Generation | Custom Script | Concurrent request simulation |
| Profiling | Stack-specific tools | Function-level analysis |

: Tools and Functions for Measurement
:::

Experimental Procedure

The experimental procedure follows a structured six-phase workflow designed to keep measurements consistent and repeatable across all technology stacks. Every framework goes through the same tests, with automated orchestration eliminating manual intervention, human bias, and timing variation.

The workflow begins with environment validation, which checks RAPL energy measurement access, Docker infrastructure availability, and database connectivity. After successful validation, each framework's container image is built and deployed with standardized resource limits (2 CPU cores and 2GB of RAM). A post-deployment baseline measurement phase then establishes idle resource consumption, providing reference points for computing per-request resource deltas.

The core testing phase runs all 15 CRUD endpoints in a systematic way, covering both positive and negative test cases. Energy consumption is recorded at microsecond resolution through Intel RAPL registers, memory consumption and CPU usage are recorded through the Docker Stats API, and response times are taken from application timestamps. Each request is bracketed by pre-execution and post-execution measurements to determine the energy delta of each operation.

After endpoint testing, measurements are aggregated and carbon emissions are computed using the global average grid carbon intensity (475 gCO2eq/kWh). The container is then torn down and its volumes are cleaned so that the next stack starts from a clean environment. After all seven stacks complete testing, cross-stack statistical analysis and visualization generation produce the comparative results.

Figure 1{reference-type="ref" reference="fig:experiment-flow"} shows this full workflow, indicating the order of execution, decision points, and the iteration structure that processes all seven technology stacks with identical measurement procedures.

Experimental workflow illustrating the complete process executed for each technology stack

The experimental procedure comprises six primary phases executed in order:

Phase 1 - Pre-Validation: Infrastructure validation covers RAPL MSR register access, the Docker daemon, port checks (8081–8087, 5432), PostgreSQL connectivity, and seed data setup (100 users, 50 addresses, 200 carbon credits, and 150 donations).

Phase 2 - Build and Deploy: Docker images are built, and containers are started with set resource limits (2 CPU, 2GB RAM) and checked for health with GET /health. Startup time is computed as $$t_{startup} = t_{ready} - t_0$$
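A startup measurement of this kind can be sketched in Python as a polling loop around the health endpoint. This is a minimal illustration under stated assumptions: the helper names, the polling interval, and the timeout are hypothetical, and $t_0$ is taken to be the moment polling begins, which should coincide with issuing the container start command.

```python
import time
import urllib.request
from typing import Callable


def http_health_probe(url: str, timeout: float = 1.0) -> bool:
    """Return True when GET <url> answers with HTTP 200, False on connection errors."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, HTTPError, refused connections, timeouts
        return False


def measure_startup(probe: Callable[[], bool], timeout_s: float = 120.0,
                    poll_interval_s: float = 0.05,
                    clock: Callable[[], float] = time.monotonic) -> float:
    """Return t_startup = t_ready - t_0 in seconds, polling until the probe passes."""
    t0 = clock()
    while True:
        if probe():
            return clock() - t0
        if clock() - t0 > timeout_s:
            raise TimeoutError("service did not become healthy in time")
        time.sleep(poll_interval_s)
```

A call such as `measure_startup(lambda: http_health_probe("http://localhost:8081/health"))` would time one stack; the port is taken from the 8081–8087 range mentioned in Phase 1, and which stack listens on it is an assumption. The polling interval bounds the resolution of the result, so it should be small relative to the startup times being compared.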

Phase 3 - Baseline: A five-minute idle period establishes the baseline. Energy: $$E_{baseline} = E_{end} - E_{start}$$ Memory: $M_{baseline}$ via "docker stats".
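One way to capture the memory baseline is to parse the output of the "docker stats" command. The sketch below assumes the CLI is invoked as `docker stats --no-stream --format '{{json .}}'`, which emits one JSON object per container with string fields such as `Name`, `MemUsage`, and `CPUPerc`; the study refers to the Docker Stats API, so parsing CLI output is a substitution made for this example, and the function names are hypothetical. Sizes are reported here in MiB, following Docker's binary units.

```python
import json
import re

# Docker prints memory sizes with binary units such as "12.5MiB" or "2GiB".
_MIB_PER_UNIT = {"B": 1 / 1024**2, "KiB": 1 / 1024, "MiB": 1.0, "GiB": 1024.0}
_SIZE_RE = re.compile(r"\s*([0-9.]+)\s*([A-Za-z]+)\s*")


def size_to_mib(text: str) -> float:
    """Convert a Docker size string such as '12.5MiB' to a float in MiB."""
    match = _SIZE_RE.fullmatch(text)
    if match is None or match.group(2) not in _MIB_PER_UNIT:
        raise ValueError(f"unrecognized size: {text!r}")
    return float(match.group(1)) * _MIB_PER_UNIT[match.group(2)]


def parse_stats_line(line: str) -> dict:
    """Parse one JSON line produced by `docker stats --no-stream --format '{{json .}}'`."""
    raw = json.loads(line)
    used, _, limit = raw["MemUsage"].partition("/")
    return {
        "name": raw["Name"],
        "mem_used_mib": size_to_mib(used),
        "mem_limit_mib": size_to_mib(limit),
        "cpu_percent": float(raw["CPUPerc"].rstrip("%")),
    }
```

Sampling this repeatedly over the idle window and averaging `mem_used_mib` would give one possible definition of $M_{baseline}$.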

Phase 4 - Per-Endpoint Testing: 15 endpoints × 2 scenarios = 30 tests per stack; 30 × 7 stacks = 210 tests in total. Metrics for each request: $$\Delta E_i = E_{after,i} - E_{before,i}$$ $$SR = \frac{n_{success}}{n_{total}} \times 100\%$$ memory $M_i$ (MB), and response time $t_{response,i}$ (ms). Cooldown: 2 seconds.
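The per-request metrics can be represented with a small record type. In this hedged Python sketch, the class and field names are hypothetical, and a request is assumed to count as successful when the API returns the status code the test expects, so a negative case that correctly yields 404 is a success; the study does not spell out this definition.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class RequestSample:
    energy_before_uj: int   # RAPL counter reading just before the request
    energy_after_uj: int    # RAPL counter reading just after the response
    status_code: int        # HTTP status actually returned
    expected_status: int    # HTTP status the test case expects
    response_ms: float      # t_response,i
    memory_mb: float        # M_i

    @property
    def delta_energy_uj(self) -> int:
        """Delta E_i = E_after,i - E_before,i (counter wraparound not handled here)."""
        return self.energy_after_uj - self.energy_before_uj

    @property
    def succeeded(self) -> bool:
        """Assumption: success means the response matched the expected status."""
        return self.status_code == self.expected_status


def success_rate(samples: Sequence[RequestSample]) -> float:
    """SR = n_success / n_total * 100, as a percentage."""
    if not samples:
        raise ValueError("no samples")
    return 100.0 * sum(s.succeeded for s in samples) / len(samples)
```

Keeping the raw before and after readings, not only the delta, makes it possible to re-check individual measurements later.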

Phase 5 - Aggregation: Metrics calculated for each stack: $$E_{total} = \sum_{i=1}^{n} \Delta E_i \text{ (kWh)}$$ $$\bar{t}_{response} = \frac{1}{n}\sum_{i=1}^{n} t_{response,i} \text{ (ms)}$$ $$M_{peak} = \max(M_i) \text{ (MB)}$$ $$CO_2eq = E_{total} \times 475 \text{ gCO}_2\text{eq/kWh}$$ Container teardown uses "docker-compose down -v".
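The aggregation step maps directly onto a few lines of Python. The sketch below is illustrative: the function name and dictionary keys are hypothetical, and the per-request energy deltas are assumed to have already been converted to kWh.

```python
from typing import Sequence

GRID_INTENSITY_G_PER_KWH = 475.0  # global average used in this study


def aggregate(delta_energy_kwh: Sequence[float],
              response_ms: Sequence[float],
              memory_mb: Sequence[float]) -> dict:
    """Compute E_total, mean response time, M_peak, and CO2eq for one stack."""
    n = len(delta_energy_kwh)
    if n == 0 or len(response_ms) != n or len(memory_mb) != n:
        raise ValueError("inputs must be non-empty and of equal length")
    e_total = sum(delta_energy_kwh)
    return {
        "e_total_kwh": e_total,
        "mean_response_ms": sum(response_ms) / n,
        "peak_memory_mb": max(memory_mb),
        "co2eq_g": e_total * GRID_INTENSITY_G_PER_KWH,
    }
```

Because the carbon figure is a constant multiple of $E_{total}$, rankings by energy and by emissions are necessarily identical under a single grid intensity.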

Phase 6 - Analysis: Statistical measures: $$\mu = \frac{1}{n}\sum_{i=1}^{n} x_i$$ $$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \mu)^2}$$ Percentiles: $P_{50}, P_{95}, P_{99}$. Outputs: 105 records (CSV/JSON) and 11 visualizations (PNG/PDF).
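These statistics can be computed with Python's standard library. The sketch uses the population standard deviation, matching the $1/n$ form of $\sigma$ above. The percentile method is not specified in the text, so the nearest-rank definition used here is an assumption, and the function names are hypothetical.

```python
import math
import statistics
from typing import Sequence


def percentile_nearest_rank(values: Sequence[float], p: float) -> float:
    """Nearest-rank percentile: the smallest value with at least p% of data at or below it."""
    if not values:
        raise ValueError("no values")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def summarize(values: Sequence[float]) -> dict:
    """Mean, population standard deviation, and P50/P95/P99 of a sample."""
    return {
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),  # divides by n, as in the sigma formula
        "p50": percentile_nearest_rank(values, 50),
        "p95": percentile_nearest_rank(values, 95),
        "p99": percentile_nearest_rank(values, 99),
    }
```

With small samples the upper percentiles collapse onto the maximum, so $P_{95}$ and $P_{99}$ are only informative when each endpoint is measured many times.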

Variable Definitions

The equations use the following symbols: $t$ denotes time (startup time $t_{startup}$ in seconds, response time $t_{response,i}$ in milliseconds); $E$ denotes energy in kilowatt-hours (baseline energy $E_{baseline}$, total energy $E_{total}$, per-request energy delta $\Delta E_i$); $M$ denotes memory use in megabytes (baseline memory $M_{baseline}$, peak memory $M_{peak}$, per-request memory snapshot $M_i$); $SR$ is the success rate, the percentage of requests that succeed; $\mu$ is the mean and $\sigma$ the standard deviation; $P_{x}$ denotes a percentile value, where $x$ is the percentile rank (50th, 95th, or 99th); $n$ is the number of requests (the sample size) and $i$ is the request index, running from 1 to $n$; and $CO_2eq$ denotes carbon dioxide equivalent emissions in grams, calculated using the global average grid carbon intensity factor of 475 gCO$_2$eq/kWh.

Test Scenarios

Table 3{reference-type="ref" reference="tab:test-scenarios"} shows the positive and negative test scenarios run for each technology stack. Each of the 15 CRUD endpoints is tested in two ways: positive cases verify the expected behavior with valid inputs, while negative cases verify correct error handling for invalid inputs, duplicates, missing resources, and malformed data.

::: {#tab:test-scenarios}
| Endpoint | Positive Case | Negative Case |
|----------|---------------|---------------|
| POST /api/users | Valid user data | Invalid email, duplicate |
| GET /api/users/{id} | Existing user | Non-existent user (404) |
| PUT /api/users/{id} | Valid updates | Invalid phone format |
| DELETE /api/users/{id} | Existing user | Non-existent user (404) |
| GET /api/users | Pagination | Invalid parameters |
| POST /api/addresses | Valid address | Invalid postal code |

: Test Scenarios (Positive and Negative Cases)
:::

Data Collection and Analysis

All measurements are collected in a structured format:

  • Raw Data: JSON files for each stack that contain per-request measurements

  • Summary Data: A CSV file that aggregates metrics across all requests

  • Statistical Analysis: Mean, median, standard deviation, and 95th percentile

Results

We ran comprehensive tests on all seven technology stacks, testing 15 CRUD endpoints on each, for a total of 105 tests. Every stack used the same PostgreSQL database and API integration, ensuring a fair comparison. Measurements were collected using Intel RAPL for energy consumption, with carbon emissions calculated using the global average grid carbon intensity of 475 gCO2eq/kWh.

Startup Time Analysis

Startup time has a significant effect on deployment costs and user experience, especially in serverless and containerized environments. Figure 2{reference-type="ref" reference="fig:startup"} shows cold-start times across all frameworks.

Comparison of startup times across seven backend frameworks{#fig:startup width="70%"}

Python FastAPI, Go Gin, and JavaScript Express had the fastest startup times, at 1.010 s, 1.005 s, and 1.021 s respectively. At the other extreme, Java Spring Boot was the slowest to start at 20.540 s, about 20 times slower than Python FastAPI. Java Quarkus, which is optimized for cloud-native deployments, started in 6.586 s, a 68% improvement over Spring Boot but still 6.5 times slower than FastAPI.
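The ratios quoted above follow directly from the measured startup times:

$$\frac{20.540}{1.010} \approx 20.3, \qquad \frac{20.540 - 6.586}{20.540} \approx 0.679 \approx 68\%, \qquad \frac{6.586}{1.010} \approx 6.5$$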

These results highlight the substantial startup overhead of JVM-based frameworks, which matters most for workloads that need rapid scaling or frequent cold starts.

Response Time Performance

Response times were measured at all 15 CRUD endpoints in a consistent manner. Figure 3{reference-type="ref" reference="fig:response"} shows the response time distribution by stack.

Response time distribution by technology stack{#fig:response width="70%"}

Python FastAPI had the lowest average response time at 4.14 ms, showing strong runtime performance even though Python is an interpreted language. JavaScript Express (8.32 ms) and Go Gin (15.21 ms) followed, and Rust Axum averaged 23.45 ms. The JVM-based frameworks had higher latencies, with Java Spring Boot at 106.31 ms and Java Quarkus at 111.40 ms, which is 25–27 times slower than FastAPI.

Energy Consumption

Energy consumption measurements recorded through Intel RAPL show the total electricity required to run all test cases. Figure 4{reference-type="ref" reference="fig:energy"} shows overall energy use for each framework.

Comparison of total energy use (µWh){#fig:energy width="65%"}

Python FastAPI used the least energy (730 µWh), followed by JavaScript Express (829 µWh) and Go Gin (934 µWh). The compiled native Rust Axum used 1,689 µWh, which is 2.3 times more than Python FastAPI. The JVM frameworks used more energy: Java Quarkus (2,911 µWh) and Java Spring Boot (3,319 µWh). C# .NET had the highest energy usage at 4,879 µWh, which is 6.7 times more than Python FastAPI.

Carbon Emissions

Carbon emission estimates were obtained by multiplying energy consumption by the global average grid carbon intensity of 475 gCO2eq/kWh, as shown in Figure 5{reference-type="ref" reference="fig:carbon"}.

Total carbon emissions (gCO2eq) per framework{#fig:carbon width="65%"}

The ranking of carbon emissions mirrors the pattern of energy consumption: Python FastAPI (0.347 gCO2eq), JavaScript Express (0.394 gCO2eq), Go Gin (0.444 gCO2eq), Rust Axum (0.802 gCO2eq), Java Quarkus (1.383 gCO2eq), Java Spring Boot (1.577 gCO2eq), and C# .NET (2.318 gCO2eq). Choosing Python FastAPI instead of C# .NET would cut carbon emissions by 85% for the same workload.
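
The 85% figure follows directly from the reported emission totals. As a minimal sketch (the helper name `relative_reduction` is ours and is not part of the study's tooling), the calculation can be reproduced as follows:

```python
def relative_reduction(baseline: float, candidate: float) -> float:
    """Return the fraction by which `candidate` is lower than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 1.0 - candidate / baseline


# Emission totals as reported in the text (gCO2eq).
emissions = {
    "Python FastAPI": 0.347,
    "C# .NET": 2.318,
}

reduction = relative_reduction(emissions["C# .NET"], emissions["Python FastAPI"])
print(f"{reduction:.1%}")  # prints 85.0%
```

Because the result is a ratio, it does not depend on the unit in which the totals are expressed; applying the same helper to the energy totals (4,879 and 730) gives the same 85%.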

Analysis of CRUD Operations

Figure 6{reference-type="ref" reference="fig:crud_heatmap"} shows a performance heatmap of response times for the different CRUD operations (Create, Read, Update, Delete, List) across all seven technology stacks. Color intensity indicates response time, which makes performance bottlenecks easy to spot and shows that Python FastAPI consistently performs better across operation types.

CRUD operations performance heatmap (response time in ms){#fig:crud_heatmap width="70%"}

All frameworks showed similar performance patterns across operation types, with List Users operations usually needing slightly longer response times because of database query complexity. Python FastAPI maintained strong performance for all CRUD operations.

Multi-Dimensional Performance Comparison

Figure 7{reference-type="ref" reference="fig:radar"} gives a normalized multi-metric assessment across the key performance dimensions: startup time, response time, energy efficiency, and carbon intensity.

Multi-dimensional performance radar chart (normalized metrics){#fig:radar width="65%"}

Python FastAPI has the most balanced profile, with strong scores in every dimension. Go Gin shows good energy efficiency with a moderate startup time. The Java frameworks have poor startup performance but reasonable runtime characteristics. C# .NET underperforms across most dimensions.
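
The normalization method behind the radar chart is not stated in this section. One common choice is min-max scaling, and the sketch below assumes that approach, inverting the scale so that lower raw values (less energy, shorter times) map to higher scores. The function name and the scoring direction are illustrative assumptions.

```python
def min_max_normalize(values: dict[str, float], lower_is_better: bool = True) -> dict[str, float]:
    """Scale values to [0, 1]; with lower_is_better, the smallest raw value scores 1.0."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        # All stacks tie on this metric, so give every stack the top score.
        return {name: 1.0 for name in values}
    scaled = {name: (v - lo) / (hi - lo) for name, v in values.items()}
    if lower_is_better:
        return {name: 1.0 - s for name, s in scaled.items()}
    return scaled


# Energy totals as reported in the text (µWh).
energy_uwh = {
    "Python FastAPI": 730,
    "JavaScript Express": 829,
    "Go Gin": 934,
    "Rust Axum": 1689,
    "Java Quarkus": 2911,
    "Java Spring Boot": 3319,
    "C# .NET": 4879,
}

energy_scores = min_max_normalize(energy_uwh)
```

Under this scheme the best and worst stacks on each axis are pinned to 1.0 and 0.0, so the chart shows relative standing rather than absolute magnitudes.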

Overall Framework Ranking

Table 4{reference-type="ref" reference="tab:ranking"} presents the full ranking across all measured metrics, combining startup time, response time, energy consumption, and carbon emissions. The table orders frameworks from most efficient (Python FastAPI) to least efficient (C# .NET), showing large performance differences between runtime architectures and demonstrating that interpreted languages can outperform compiled alternatives for certain workload types.

::: {#tab:ranking}
| Stack              | Startup (s) | Response (ms) | Energy (µWh) | CO2 (gCO2eq) |
|--------------------|-------------|---------------|--------------|--------------|
| Python FastAPI     | 1.010       | 4.14          | 730          | 0.347        |
| JavaScript Express | 1.021       | 8.32          | 829          | 0.394        |
| Go Gin             | 1.005       | 15.21         | 934          | 0.444        |
| Rust Axum          | 0.004       | 23.45         | 1,689        | 0.802        |
| Java Quarkus       | 6.586       | 111.40        | 2,911        | 1.383        |
| Java Spring Boot   | 20.540      | 106.31        | 3,319        | 1.577        |
| C# .NET            | 2.182       | 74.75         | 4,879        | 2.318        |

: Complete ranking of framework performance
:::

Discussion

Interpretation of Results

Impact of Runtime Architecture

Contrary to the common assumption that compiled native languages always outperform interpreted alternatives in energy efficiency, our results reveal more nuanced patterns. Python FastAPI showed the lowest energy consumption (730 µWh) and the fastest response times (4.14 ms), which challenges the idea that compiled languages automatically translate to better energy efficiency.

Several factors contribute to this result: (1) mature runtime optimizations in CPython 3.12, (2) FastAPI's efficient async I/O implementation, which keeps the CPU from blocking, and (3) the lightweight test workload, which favors Python's optimized C extensions for database access. Rust Axum, by contrast, consumed 2.3× more energy despite native compilation, which suggests that framework-level optimizations and I/O handling patterns affect overall efficiency beyond compilation strategy alone.

The JVM-based frameworks (Java Spring Boot and Quarkus) showed the highest latency (106–111 ms) and substantial energy consumption (2,911–3,319 µWh). The JVM's JIT compilation warmup period, combined with garbage collection overhead during execution, contributed to both longer response times and higher energy use. Quarkus showed only a small energy advantage over Spring Boot in JVM mode, which suggests that framework-level optimizations offer limited benefit when constrained by the characteristics of the underlying runtime.

Trade-offs Between Startup Time and Runtime Performance

Java Spring Boot's 20.540 s startup time is a major limitation for cloud-native deployments that need to scale quickly. Although Quarkus reduces startup time to 6.586 s (68% faster), it is still 6.5 times slower than Python FastAPI (1.010 s). This gap directly affects:

  • Serverless Economics: Longer cold starts increase billable execution time

  • Auto-scaling Efficiency: Slow startup reduces responsiveness to sudden increases in traffic

  • Development Iteration: Longer feedback loops during local development

Notably, runtime performance does not compensate for startup overhead: the Java frameworks had both the slowest startup and the highest response latencies, which points to systemic inefficiencies rather than trade-offs between startup and runtime.

Recommendations and Insights for Each Framework

Python FastAPI achieves its efficiency through the uvicorn ASGI event loop, Pydantic's Cython validators, and SQLAlchemy connection pooling. Recommended for: cloud-native deployments, sustainability-critical applications (85% lower emissions than C# .NET), and CRUD-intensive APIs.

Go Gin shows moderate energy consumption (934 µWh) and starts up quickly (1.005 s). Recommended for: microservice architectures that need predictable resource usage and frequent deployments.

Rust Axum shows higher energy consumption (1,689 µWh), which suggests optimization opportunities in the async runtime (Tokio) and the database driver (SQLx).

The JVM frameworks show systematic inefficiencies: Java Spring Boot (20.54 s startup, 3,319 µWh) and Quarkus (6.59 s startup, 2,911 µWh). Consider for: legacy enterprise ecosystems with existing JVM infrastructure, while planning a migration to GraalVM native images over the longer term.

C# .NET performs poorly across the measured criteria (4,879 µWh, the highest energy consumption). The CLR garbage collector and framework abstractions present sustainability challenges.

For Legacy Enterprise Ecosystems

Recommendation: Java Quarkus over Spring Boot

Organizations constrained to JVM environments should migrate from Spring Boot to Quarkus for 68% faster startup. However, runtime energy consumption remains similar, which suggests that JVM-level optimizations (GraalVM native image, G1GC tuning) are needed to approach the efficiency of non-JVM solutions.

Not Recommended: C# .NET

Unless specific ecosystem requirements mandate .NET, alternative frameworks offer better energy efficiency (2–6 times better) and better performance (response times 2 to 18 times faster).

Limitations

This study has several limitations: the workload focuses on CRUD operations; the results are specific to the test hardware (Intel Core i7-1270P); RAPL measures only CPU package power; and the results depend on the framework versions used.

Conclusion and Future Work

This study assessed Software Carbon Intensity across seven backend frameworks, running 105 CRUD tests against PostgreSQL. The key findings are: (1) Python FastAPI had the lowest energy consumption (730 µWh), the fastest response time (4.14 ms), and the lowest emissions (0.347 gCO2eq), outperforming compiled alternatives; (2) the JVM frameworks showed structural inefficiencies, with slow startup (6.586–20.540 s) and high energy consumption (2,911–3,319 µWh); (3) the 6.7× spread in energy consumption indicates up to 85% carbon reduction potential through framework choice; (4) the 20× difference in startup time makes JVM frameworks poorly suited to cloud-native deployments without GraalVM native compilation.

What This Means and What Comes Next

For practitioners: consider the operational carbon footprint alongside development velocity when choosing technology stacks. Profile production applications to find energy hotspots. Use hybrid approaches that apply energy-efficient languages to compute-intensive microservices while keeping productive languages for less critical components.

Future work could investigate: (1) broader workload categories (batch processing, streaming, ML inference), (2) multi-tenant cloud environments with elastic scaling, (3) longitudinal tracking of framework evolution across versions, (4) the effect of coding styles on energy use, (5) carbon-aware workload scheduling methods, and (6) IDE plugins that provide real-time feedback on energy use.

Final Thoughts

As digital transformation accelerates and climate action becomes more urgent, Green Software Engineering must evolve from a niche concern into a fundamental principle of software development. This study shows that technology choice has a significant effect on environmental sustainability, and that evidence-based decisions can substantially lower the carbon footprint of enterprise systems.

By making our complete dataset, measurement tools, and analysis scripts publicly available, we hope to encourage further research and to help practitioners make informed, sustainable technology choices.

Acknowledgments {#thanks.unnumbered}

The authors gratefully acknowledge financial support from STMIK Tazkia for sponsoring the publication, and Jejakin.com, a climate technology company, for providing research facilities, computing resources, and carbon accounting expertise. We also thank the open-source communities behind Spring, Quarkus, Gin, Axum, FastAPI, Express, and the .NET frameworks.


Introduction

Background and Reason for Doing It

Climate change has become one of the most pressing challenges facing humanity in the 21st century, and the Information and Communication Technology (ICT) sector plays an increasingly significant role in it. Current estimates indicate that the ICT industry accounts for about 1.8% to 3.9% of global greenhouse gas emissions, a share comparable to that of the aviation industry [@Freitag2021]. This contribution is expected to grow substantially as digital transformation accelerates across all sectors of the global economy.

The rapid expansion of cloud computing, microservices architectures, and API-driven applications has driven unprecedented demand for backend services. Data centers worldwide operate around the clock to serve these services, consuming large amounts of electricity and generating substantial carbon emissions. In this context, the choice of programming languages and frameworks for backend development becomes not only a technical decision but also an environmental imperative [@Koenig2020; @Wasif2024].

Green Software Engineering (GSE) has emerged as an important discipline for building software systems that are energy-efficient and environmentally sustainable [@Ardito2015]. A key metric in GSE is Software Carbon Intensity (SCI), which measures the carbon emissions produced per functional unit of software operation. Although earlier studies have investigated algorithmic optimization for energy efficiency within individual programming languages (GSE Type 1 at the algorithmic level) [@Karisma2026; @Lannelongue2021], comprehensive comparative studies of how different backend frameworks affect SCI values under realistic business workloads remain scarce. This research applies GSE Type 1 concepts at the level of framework selection, assessing the inherent efficiency differences among seven modern technology stacks running the same business logic.

The Problem of Choosing Technology

Modern enterprises commonly choose backend technologies based mainly on factors such as developer familiarity or ecosystem popularity, without considering the long-term environmental impact [@GreenAdoption2024]. This approach overlooks the large differences in energy consumption across programming paradigms:

  • Interpreted Languages: Python and JavaScript (Node.js) dominate current development because of their rapid prototyping capabilities and extensive libraries. However, their runtime interpretation overhead often leads to higher energy consumption than compiled alternatives [@Pereira2017; @PythonMicroscope2025; @vanKempen2024].

  • Compiled Native Languages: Languages such as Rust and Go descend from the C/C++ family and compile directly to machine code with minimal runtime overhead. Rust's ownership-based memory management eliminates the need for garbage collection, which in theory offers better energy efficiency [@RustOwnership2023].

  • Virtual Machine (VM) Based Languages: Java and C# run on sophisticated virtual machines (the JVM and the CLR, respectively) that use Just-In-Time (JIT) compilation. These platforms offer portability and mature ecosystems, but they add memory management overhead through various garbage collection strategies [@JavaJIT2024; @DotNetCLR2024].

Research Gap and Objectives

Despite growing awareness of Green Software Engineering principles, several important knowledge gaps remain:

  1. Comprehensive Framework Comparison: Most existing studies focus on language-level comparisons using benchmark programs rather than realistic application workloads [@Pereira2017; @ProgrammingRank2024].

  2. Per-Endpoint Analysis: Little research examines energy consumption at the level of individual API endpoints (CRUD operations), which is necessary for identifying optimization opportunities.

  3. Positive vs. Negative Cases: The effect of error handling and validation (negative test cases) on energy consumption, compared with successful operations (positive cases), remains unexplored.

  4. Effect on Startup Time: The energy consumed during application startup, which is important for serverless and containerized deployments, is rarely quantified.

This research fills these gaps through empirical SCI measurements across seven contemporary backend frameworks that represent diverse runtime paradigms. We examine not only total energy consumption but also per-request metrics, startup overhead, and function-level profiling to identify energy hotspots.

Contributions to Research

This study contributes to the field of Green Software Engineering in the following ways:

  1. Comprehensive Empirical Data: Thorough measurements of energy consumption, memory utilization, carbon emissions, response time, and power usage across seven production-ready backend frameworks that implement identical business logic.

  2. Per-Endpoint Granularity: Detailed analysis of CRUD operations (Create, Read, Update, Delete) across multiple domain entities (Users, Addresses, Carbon Credits, Tree Donations).

  3. Positive and Negative Case Analysis: Comparison of energy consumption between successful operations and error handling scenarios.

  4. Practical Recommendations: Data-driven guidance for software architects selecting technologies for sustainable enterprise systems.

  5. Open Dataset: All experimental data, measurement scripts, and analysis tools, made available to the research community for validation and extension.

The remainder of this paper is organized as follows: Section 2 presents the theoretical foundation of Green Software Engineering and programming language characteristics; Section 3 details our experimental methodology; Section 4 presents the full results; Section 5 discusses the findings and their implications; and Section 6 concludes with directions for future research.

Theoretical Background and Related Work

Principles of Green Software Engineering

Green Software Engineering is a discipline that addresses the need to reduce the environmental impact of computing systems across their full development and operational life cycle [@Ardito2015; @Koenig2020]. The field rests on three interconnected principles:

  1. Energy Efficiency: Minimizing the electricity that software needs in order to run, which reduces the carbon emissions associated with electricity use during operation.

  2. Hardware Efficiency: Making the most of existing computational infrastructure through better software design, which lowers the need to deploy additional physical hardware.

  3. Carbon Awareness: Incorporating knowledge of grid carbon intensity into decisions about when to schedule computation, so that workloads can run when renewable energy sources dominate the power mix.
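
To make the carbon-awareness principle concrete, the sketch below picks the start hour for a deferrable job so that it runs in the window with the lowest average grid carbon intensity. The forecast values and the function name are hypothetical; a real deployment would obtain the forecast from a grid data provider.

```python
def greenest_start_hour(forecast: list[float], duration_hours: int) -> int:
    """Return the start index of the contiguous window with the lowest total intensity."""
    if not 0 < duration_hours <= len(forecast):
        raise ValueError("duration must be between 1 and the forecast length")
    window = sum(forecast[:duration_hours])
    best_start, best_sum = 0, window
    for start in range(1, len(forecast) - duration_hours + 1):
        # Slide the window one hour forward: add the new hour, drop the oldest.
        window += forecast[start + duration_hours - 1] - forecast[start - 1]
        if window < best_sum:
            best_start, best_sum = start, window
    return best_start


# Hypothetical hourly grid carbon intensity forecast (gCO2eq/kWh).
hourly_forecast = [520.0, 480.0, 410.0, 300.0, 280.0, 350.0, 470.0, 510.0]

start = greenest_start_hour(hourly_forecast, duration_hours=2)
print(start)  # prints 3
```

With this forecast a two-hour job is scheduled to start at index 3, covering the 300 and 280 gCO2eq/kWh hours.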

The underlying fact of computing is that every processor instruction consumes electricity, and that electricity translates into carbon emissions depending on how it was generated [@Pathania2024]. Writing efficient software that reduces processing cycles, cuts memory allocation overhead, and optimizes input/output operations is therefore a direct way to reduce environmental impact.

Green Software Engineering Taxonomy: GSE Type 1 and Type 2

Recent academic work has proposed a conceptual taxonomy for categorizing the relationship between software and environmental sustainability, distinguishing two fundamental approaches to green computing [@Karisma2026]:

  • GSE Type 1 (Green in Software Engineering): This type addresses the intrinsic efficiency of software systems, focusing on reducing computational resource consumption through architectural and algorithmic improvement. The main goal is to improve performance characteristics that translate directly into lower energy use and a smaller carbon footprint. Example interventions include algorithmic complexity reduction (for example, $O(n^2) \rightarrow O(n \log n)$), database query optimization, memory footprint reduction, and execution path efficiency improvements.

  • GSE Type 2 (Green by Software Engineering): This type covers software built specifically to support and monitor sustainability efforts in organizations and in society at large. Rather than making the software itself more resource-efficient, these systems serve as tools for environmental goals such as carbon neutrality, ESG compliance, and climate action. Common examples include emission monitoring platforms, renewable energy grid management systems, sustainability metrics dashboards, and carbon accounting systems.

These two categories are distinct but complementary: Type 1 addresses the direct environmental cost of running software, whereas Type 2 uses software capabilities to enable and accelerate external sustainability efforts.
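
As an illustration of a Type 1 intervention, the sketch below shows the kind of complexity reduction mentioned above, $O(n^2) \rightarrow O(n \log n)$, on a duplicate-detection task. The task and function names are chosen for illustration only and do not come from the study.

```python
def has_duplicate_quadratic(items: list[int]) -> bool:
    """Compare every pair of elements: O(n^2) comparisons."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False


def has_duplicate_sorted(items: list[int]) -> bool:
    """Sort first, then compare neighbours: O(n log n) overall."""
    ordered = sorted(items)
    return any(a == b for a, b in zip(ordered, ordered[1:]))


sample = [7, 3, 9, 3, 1]
print(has_duplicate_quadratic(sample), has_duplicate_sorted(sample))  # prints True True
```

Both functions return the same answer, but the second performs far fewer comparisons on large inputs, which is the sense in which fewer processing cycles translate into lower energy use.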

The current study focuses exclusively on GSE Type 1 concerns, examining how technology stack selection, particularly the architecture of the programming language runtime and web framework, affects the baseline energy efficiency of functionally equivalent backend systems. Whereas previous Type 1 research investigates algorithmic optimization opportunities within a single language environment [@Karisma2026], our contribution provides comparable cross-language energy measurements at the framework layer, which supports architectural decisions in enterprise application development.

Software Carbon Intensity (SCI)

The Software Carbon Intensity metric, developed by the Green Software Foundation, provides a standardized method for measuring software carbon emissions. The basic SCI formula is:

$$SCI = \frac{(E \times I) + M}{R} \label{eq:sci}$$

Where:

  • $E$ = energy consumed by the software system (kWh)

  • $I$ = carbon intensity of the regional power grid (gCO2eq/kWh)

  • $M$ = embodied carbon emissions from hardware manufacturing

  • $R$ = functional unit (for example, per API request or per user session)

This study focuses primarily on operational emissions $(E \times I)$, because the embodied carbon $M$ remains constant across all experiments that use the same hardware infrastructure. The regional carbon intensity $I$ is set to the global average of 475 gCO2eq/kWh, taken from the reference literature [@Lannelongue2021].
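
The formula can be written as a small function. The sketch below is a minimal illustration with hypothetical input values; the function and constant names are ours. Because $E$ is defined in kWh, measurements logged in smaller units such as µWh need to be converted before they are multiplied by $I$.

```python
UWH_PER_KWH = 1e9  # 1 kWh = 1,000 Wh = 1,000,000,000 µWh


def uwh_to_kwh(energy_uwh: float) -> float:
    """Convert microwatt-hours to kilowatt-hours."""
    return energy_uwh / UWH_PER_KWH


def sci(energy_kwh: float, intensity_g_per_kwh: float,
        embodied_g: float, functional_units: float) -> float:
    """SCI = ((E * I) + M) / R, in gCO2eq per functional unit."""
    if functional_units <= 0:
        raise ValueError("functional_units must be positive")
    return (energy_kwh * intensity_g_per_kwh + embodied_g) / functional_units


# Hypothetical inputs: 0.002 kWh of operational energy, the 475 gCO2eq/kWh
# grid intensity used in this study, 0.05 gCO2eq of amortized embodied
# carbon, and 1,000 API requests as the functional unit.
score = sci(energy_kwh=0.002, intensity_g_per_kwh=475.0,
            embodied_g=0.05, functional_units=1000)
```

Setting `embodied_g` to zero gives the operational term alone, which is the quantity compared across frameworks in this study.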

Energy Efficiency of Programming Languages

Recent research has examined energy consumption across programming languages in depth. Pereira et al. [@Pereira2017] conducted a seminal study of 27 programming languages across 10 benchmark problems, finding that compiled languages such as C, C++, and Rust consistently outperformed interpreted languages in energy consumption by factors ranging from 2x to 75x.

More recent studies provide additional insights:

  • Language Rankings: Updated rankings based on energy measurements confirm that compiled native languages (C, C++, Rust, Go) lead in energy efficiency, while interpreted languages (Python, Ruby, Perl) consume considerably more energy [@ProgrammingRank2024].

  • Python-Specific Analysis: A detailed examination of Python execution methods shows that CPython interpreter overhead contributes substantially to energy consumption, with PyPy (JIT) and Cython (compiled) offering improvements [@PythonMicroscope2025].

  • Green AI Perspective: Analysis of machine learning workloads shows that language choice significantly affects the carbon footprint of training and deploying AI models [@Marini2024].

Features of the Backend Framework

This study analyzes seven frameworks that represent three distinct runtime architecture types:

Compiled Native Languages

Go (Gin Framework): Go, developed by Google, is a statically typed, compiled language designed for concurrent programming. Its key characteristics include:

  • Lineage: Descends from the C family and compiles to native machine code

  • Memory Management: Concurrent mark-and-sweep garbage collector optimized for low latency [@GoGC2023]

  • Concurrency: Lightweight goroutines make it easy to handle requests concurrently

  • Performance Profile: Fast startup, low memory footprint, predictable performance

Rust (Axum Framework): Rust emphasizes memory safety without garbage collection through its ownership system [@RustOwnership2023]. Its characteristics include:

  • Lineage: Systems programming language influenced by C++, compiled to native code

  • Memory Management: Compile-time ownership checking eliminates runtime GC overhead

  • Safety Guarantees: Prevents common memory errors (null pointers, data races) at compile time

  • Performance Profile: Comparable to C/C++, zero-cost abstractions, minimal runtime

Virtual Machine with JIT Compilation

Java (Spring Boot and Quarkus): Java runs on the Java Virtual Machine (JVM) with advanced Just-In-Time compilation:

  • Lineage: Object-oriented language with a mature ecosystem (25+ years)

  • Memory Management: Generational garbage collection with multiple collector algorithms [@JavaJIT2024]

  • JIT Compilation: The HotSpot compiler optimizes frequently executed code paths at runtime

  • Spring Boot: Traditional JVM framework with an extensive feature set

  • Quarkus: Modern framework optimized for containers, with support for native compilation through GraalVM

C# (.NET): Runs on the Common Language Runtime (CLR) with sophisticated JIT compilation:

  • Lineage: Designed to combine the performance of C++ with Java-like managed memory

  • Memory Management: Generational GC with workstation and server modes [@DotNetCLR2024]

  • JIT Compilation: RyuJIT compiler with tiered compilation for optimal performance

  • Performance Profile: Comparable to Java, with a modern runtime architecture

Interpreted/JIT Hybrid Languages

Python (FastAPI): Python is a dynamically typed, primarily interpreted language:

  • Lineage: High-level language that prioritizes developer productivity

  • Execution Model: CPython interpreter with limited JIT optimization

  • Memory Management: Reference counting with a cycle-detecting GC

  • Performance Profile: Slower execution but faster development [@PythonMicroscope2025]

JavaScript (Express.js): Runs on the V8 engine with advanced JIT compilation:

  • Lineage: Originated as a browser scripting language and is now ubiquitous through Node.js

  • Execution Model: V8 engine with the TurboFan JIT compiler

  • Memory Management: Generational garbage collection

  • Event Loop: Single-threaded, event-driven architecture for I/O operations

Related Work on Measuring Energy

Several approaches exist for measuring software energy consumption:

  • Hardware Instrumentation: Power meters and specialized sensors give precise readings but require physical access to the infrastructure [@Chowdhury2018].

  • RAPL Interface: Intel's Running Average Power Limit (RAPL) exposes CPU package power consumption through model-specific registers, offering high accuracy for processor-bound workloads.

  • Software Estimation Models: Tools such as CodeCarbon and GreenScaler estimate energy consumption from hardware utilization metrics [@CodeCarbon; @Chowdhury2018].
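
On Linux, RAPL counters are also exposed through the powercap sysfs interface, which is an alternative to reading model-specific registers directly. The sketch below assumes that interface and the usual package-0 path; the study's own collection scripts may work differently, and reading the counter may require elevated privileges on recent kernels. The helper names are ours.

```python
from pathlib import Path

# Assumed location of the package-0 RAPL domain in the Linux powercap tree.
RAPL_DIR = Path("/sys/class/powercap/intel-rapl:0")


def read_counter_uj(rapl_dir: Path = RAPL_DIR) -> int:
    """Read the cumulative energy counter in microjoules."""
    return int((rapl_dir / "energy_uj").read_text().strip())


def read_max_range_uj(rapl_dir: Path = RAPL_DIR) -> int:
    """Read the value at which the cumulative counter wraps around."""
    return int((rapl_dir / "max_energy_range_uj").read_text().strip())


def energy_delta_uj(before: int, after: int, max_range: int) -> int:
    """Energy used between two readings, allowing for one counter wraparound."""
    if after >= before:
        return after - before
    # The counter wrapped: count up to the maximum, then from zero to `after`.
    return (max_range - before) + after
```

A measurement then consists of one reading before the workload, one after, and a call to `energy_delta_uj` with the wrap value reported by the kernel.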

Recent research has emphasized the importance of assessing the full energy footprint of the software development lifecycle [@Rahman2022] and of collaborative green business process management [@Jakobi2016].

Methodology

This study uses a controlled experimental approach to ensure a fair comparison of seven technology stacks. Each framework implements the same business logic and the same functional requirements.

Experimental Design

Our experimental design uses a systematic framework to measure and compare the energy efficiency of seven backend frameworks under controlled conditions. The study follows a between-subjects design in which each framework represents a distinct treatment group, which removes the carryover and learning effects that can arise in within-subjects designs.

The experimental setup ensures internal validity through rigorous control of confounding variables: all frameworks run on the same hardware, execute the same workload, and connect to the same database configuration. External validity is addressed by implementing a realistic e-commerce application with representative business logic (user management, address handling, carbon credit transactions, and tree contribution tracking), which makes the results applicable to comparable production systems.

To ensure reliability, each test case is run several times across different execution scenarios (both positive and negative), and the measurements are aggregated using statistical methods. The experimental procedure is fully automated and documented, which enables full reproducibility by independent researchers.
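
The specific statistics used to aggregate repeated runs are not listed in this section. As a minimal sketch, assuming the mean, median, and sample standard deviation are of interest, repeated measurements for one test case could be summarized as follows (the sample values are hypothetical):

```python
from statistics import mean, median, stdev


def summarize(samples: list[float]) -> dict[str, float]:
    """Aggregate repeated measurements of one test case into summary statistics."""
    if len(samples) < 2:
        raise ValueError("at least two samples are needed for a standard deviation")
    return {
        "n": len(samples),
        "mean": mean(samples),
        "median": median(samples),
        "stdev": stdev(samples),  # sample standard deviation (n - 1 denominator)
    }


# Hypothetical response times (ms) from five repetitions of one endpoint test.
response_ms = [4.0, 4.2, 4.1, 4.3, 4.4]
summary = summarize(response_ms)
```

Reporting the spread alongside the mean makes it easier to judge whether a difference between two frameworks exceeds run-to-run variation.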

The research design follows the Goal-Question-Metric (GQM) framework [@Basili1994GQM]: our Goal is to assess energy efficiency; our Questions ask which framework achieves the lowest energy consumption, the best performance, and the lowest carbon emissions; our Metrics include energy consumption (kWh), response time (ms), memory utilization (MB), CPU utilization (%), carbon emissions (gCO2eq), and startup time (seconds).

Research Variables

Table 1{reference-type="ref" reference="tab:research-variables"} gives a full picture of the research variables used in this study. The matrix sorts 19 variables into three groups: independent variables (7 technology stacks), dependent variables (6 performance and sustainability metrics), and control variables (6 environmental constants), each with its operational definition and measurement method.

::: {#tab:research-variables}
| Type | Variable | Operational Definition | Measurement |
|------|----------|------------------------|-------------|
| Independent Variables (Technology Stack Selection) | | | |
| IV-1 | Java Spring Boot 3.2 | Traditional JVM framework with extensive capabilities | Framework selection |
| IV-2 | Java Quarkus 3.6 | Cloud-native JVM framework optimized for containers | Framework selection |
| IV-3 | Go Gin 1.9 | Compiled native with HTTP web framework | Framework selection |
| IV-4 | Rust Axum 0.7 | Compiled native with async web framework | Framework selection |
| IV-5 | Python FastAPI 0.104 | Asynchronous ASGI framework on Python 3.12 | Framework selection |
| IV-6 | JavaScript Express 4.18 | Node.js web application framework | Framework selection |
| IV-7 | C# .NET 6.0 | CLR-based framework with RyuJIT compiler | Framework selection |
| Dependent Variables (Performance and Sustainability Metrics) | | | |
| DV-1 | Energy Consumption | Electrical energy consumed during execution (kWh) | Intel RAPL via MSR registers |
| DV-2 | Response Time | Time between request and response (ms) | Application timestamps |
| DV-3 | Memory Usage | Peak and average RAM utilization (MB) | Docker Stats API |
| DV-4 | CPU Utilization | Percentage of CPU cycles used (%) | Docker Stats API |
| DV-5 | Carbon Emissions | CO2eq emissions from energy (gCO2eq) | E × 475 gCO2/kWh |
| DV-6 | Startup Time | Time from container start to passing the health check (s) | Container timestamps |
| Control Variables (Environmental Constants) | | | |
| CV-1 | Database System | PostgreSQL 15.0 with identical schema | Fixed configuration |
| CV-2 | Hardware | Intel Core i7-1270P (12 cores, max 4.8 GHz), RAPL-enabled | Physical machine |
| CV-3 | OS Environment | Ubuntu 22.04 LTS (Jammy Jellyfish) | System environment |
| CV-4 | Container Runtime | Docker 24.0.6 with containerd | Containerization |
| CV-5 | Resource Limits | 2 CPU cores, 2 GB RAM, no swap | Docker constraints |
| CV-6 | Network | Localhost bridge (no external latency) | Docker networking |

: Matrix of Research Variables
:::

Application Implementation

Each framework implements the same RESTful API with the following features:

  • Domain Model: Four related entities (User, Address, CarbonCredit, and TreeContribution)

  • Business Logic: CRUD operations, validation, authentication, and aggregation queries

  • Database Operations: Connection pooling, prepared statements, and transaction management

  • Validation: Indonesian phone number format, postal code validation, and email validation

  • Security: BCrypt password hashing and input sanitization

All implementations follow SOLID principles and the best practices of their respective ecosystems.
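
The exact validation rules are not spelled out here, so the sketch below is only an illustration of the kind of checks involved. The patterns are assumptions: a mobile number starting with `+62`, `62`, or `0` followed by `8` and further digits, a five-digit postal code, and a deliberately simplified email shape.

```python
import re

# Assumed patterns for illustration; the study's actual rules may differ.
PHONE_RE = re.compile(r"(?:\+62|62|0)8[1-9][0-9]{6,10}")
POSTAL_CODE_RE = re.compile(r"[0-9]{5}")
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def is_valid_phone(value: str) -> bool:
    """Check an Indonesian mobile number against the assumed pattern."""
    return PHONE_RE.fullmatch(value) is not None


def is_valid_postal_code(value: str) -> bool:
    """Check that the postal code consists of exactly five ASCII digits."""
    return POSTAL_CODE_RE.fullmatch(value) is not None


def is_valid_email(value: str) -> bool:
    """Check for a single '@' with a dotted domain part (simplified)."""
    return EMAIL_RE.fullmatch(value) is not None
```

Requests that fail such checks correspond to the negative test cases in the experiment, where the API is expected to reject the input before touching the database.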

Measurement Infrastructure

Table 2{reference-type="ref" reference="tab:measurement-tools"} lists the measurement tools used in this research and their roles. The infrastructure combines Intel RAPL for hardware-level energy monitoring with container metrics (Docker Stats), custom performance logging, and stack-specific profiling tools to capture a full picture of performance and sustainability.

::: {#tab:measurement-tools}
| Category | Tool | Function |
|---|---|---|
| Energy | RAPL | CPU package power use |
| Memory | Docker Stats | Container memory consumption |
| Performance | Custom Logger | Response time measurement |
| Load Generation | Custom Script | Concurrent request simulation |
| Profiling | Stack-specific tools | Function-level analysis |

: Tools and Functions for Measurement
:::

Experimental Procedure

The experimental procedure follows a structured six-phase workflow designed to keep measurements consistent and repeatable across all technology stacks. All frameworks go through the same tests, with automated orchestration removing manual intervention, human bias, and timing variations.

The workflow begins with environment validation, which confirms access to RAPL energy measurement, availability of the Docker infrastructure, and database connectivity. After successful validation, each framework's container image is built and deployed with standardized resource limits (2 CPU cores and 2GB of RAM). After deployment, a baseline measurement phase records idle resource consumption, which provides the reference points for computing per-request resource deltas.
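The standardized limits can be expressed as container start-up arguments. The sketch below builds a `docker run` command with 2 CPUs, 2GB of RAM, and swap disabled by setting the swap limit equal to the memory limit. It is an assumption-laden illustration: the study's actual orchestration files are not shown, and the image name, container name, and container port are hypothetical.

```python
def docker_run_args(image: str, name: str, host_port: int,
                    container_port: int = 8080,
                    cpus: int = 2, memory: str = "2g") -> list[str]:
    """Build a `docker run` command line with fixed resource limits."""
    return [
        "docker", "run", "-d",
        "--name", name,
        "--cpus", str(cpus),        # CPU limit
        "--memory", memory,         # RAM limit
        "--memory-swap", memory,    # equal to --memory, so no swap is available
        "-p", f"{host_port}:{container_port}",
        image,
    ]


if __name__ == "__main__":
    # Hypothetical image and container names; 8081 is within the port range
    # checked during pre-validation.
    print(" ".join(docker_run_args("sci/fastapi:latest", "sci-fastapi", 8081)))
```

Passing the returned list to `subprocess.run` would start the container; building the list in one place makes it harder for limits to drift between stacks.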

The core testing phase runs all 15 CRUD endpoints systematically, covering both positive and negative test cases. Energy usage is recorded at microsecond resolution using Intel RAPL registers, response times are taken from application timestamps, and memory consumption and CPU usage are recorded through the Docker Stats API. Each request is bracketed by measurements taken before and after execution to determine how much energy the operation uses.
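The before-and-after energy bracketing can be sketched as below. Note the substitution: the study reads RAPL through MSR registers, whereas this sketch reads the Linux powercap sysfs counters (`energy_uj` and `max_energy_range_uj`), which expose the same package-level counter in microjoules. The directory path, function names, and single-wraparound handling are assumptions, and reading the counter usually needs elevated privileges.

```python
from pathlib import Path
from typing import Callable

# Assumed location of the package-0 RAPL domain in the powercap sysfs tree.
RAPL_DIR = Path("/sys/class/powercap/intel-rapl:0")
UJ_PER_UWH = 3_600.0   # 1 µWh = 3.6e-3 J = 3,600 µJ
UJ_PER_KWH = 3.6e12    # 1 kWh = 3.6e6 J = 3.6e12 µJ


def read_counter(name: str) -> int:
    """Read an integer counter file such as energy_uj."""
    return int((RAPL_DIR / name).read_text().strip())


def energy_delta_uj(before: int, after: int, max_range: int) -> int:
    """Counter difference in µJ, allowing for at most one wraparound."""
    if after >= before:
        return after - before
    return after + max_range - before


def uj_to_uwh(uj: float) -> float:
    return uj / UJ_PER_UWH


def uj_to_kwh(uj: float) -> float:
    return uj / UJ_PER_KWH


def measure_uwh(action: Callable[[], object]) -> float:
    """Package energy in µWh consumed while action() runs.

    The counter covers the whole CPU package, not only the container.
    """
    max_range = read_counter("max_energy_range_uj")
    before = read_counter("energy_uj")
    action()
    after = read_counter("energy_uj")
    return uj_to_uwh(energy_delta_uj(before, after, max_range))
```

Because the counter is package-wide, subtracting an idle baseline measured over the same duration is what isolates the request's contribution.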

After endpoint testing, measurements are aggregated and carbon emissions are computed using the global average grid carbon intensity (475 gCO2eq/kWh). The container is then torn down and its volumes are cleaned so that the next stack starts from a clean environment. Once all seven stacks have completed testing, cross-stack statistical analysis and visualization generation produce the comparative results.

Figure 1{reference-type="ref" reference="fig:experiment-flow"} shows this full workflow, indicating the order of execution, decision points, and the iteration structure that processes all seven technology stacks with the same measurement procedure.

Experimental workflow illustrating the complete process executed for each technology stack

The experimental procedure comprises six primary phases executed in order:

Phase 1 - Pre-Validation: Infrastructure validation covers RAPL MSR register access, the Docker daemon, port checks (8081–8087, 5432), PostgreSQL connectivity, and seed data setup (100 users, 50 addresses, 200 carbon credits, and 150 tree contributions).

Phase 2 - Build and Deploy: Docker images are built, containers are started with fixed resource limits (2 CPU, 2GB RAM), and readiness is checked with GET /health. Startup time is computed as $$t_{startup} = t_{ready} - t_0$$
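A minimal sketch of the startup measurement $t_{startup} = t_{ready} - t_0$ follows. It polls a health probe until it succeeds and returns the elapsed time. The polling interval, timeout, and function names are assumptions; $t_0$ is taken when the function is called, so it should be invoked immediately after the container is started.

```python
import http.client
import time
import urllib.request
from typing import Callable


def http_probe(url: str) -> bool:
    """Return True once the health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=1) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeouts, and HTTP errors all mean "not ready".
        return False


def measure_startup(probe: Callable[[], bool],
                    timeout_s: float = 60.0,
                    interval_s: float = 0.05,
                    clock: Callable[[], float] = time.monotonic,
                    sleep: Callable[[float], None] = time.sleep) -> float:
    """Poll probe() and return t_ready - t_0 in seconds."""
    t0 = clock()
    while True:
        if probe():
            return clock() - t0
        if clock() - t0 >= timeout_s:
            raise TimeoutError("service did not become healthy in time")
        sleep(interval_s)


# Example (hypothetical URL):
#   measure_startup(lambda: http_probe("http://localhost:8081/health"))
```

The clock and sleep functions are injectable so the loop can be exercised without a running container; the polling interval bounds the measurement resolution.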

Phase 3 - Baseline: A five-minute idle period establishes the baseline. Energy: $$E_{baseline} = E_{end} - E_{start}$$ Memory: $M_{baseline}$ via "docker stats".

Phase 4 - Endpoint Testing: 15 endpoints × 2 scenarios = 30 tests per stack; across 7 stacks this gives 210 tests in total. Metrics for each request: $$\Delta E_i = E_{after,i} - E_{before,i}$$ $$SR = \frac{n_{success}}{n_{total}} \times 100\%$$ together with memory $M_i$ (MB) and response time $t_{response,i}$ (ms). Cooldown: 2 seconds.
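The test count and the per-request formulas can be checked with a short sketch. The stack labels and the use of endpoint indices 1 to 15 are placeholders, since only the counts are given here; the two functions mirror $\Delta E_i$ and $SR$ directly.

```python
from itertools import product

# Placeholder labels for the seven stacks and fifteen endpoints.
STACKS = ["spring-boot", "quarkus", "gin", "axum", "fastapi", "express", "dotnet"]
ENDPOINT_IDS = range(1, 16)
SCENARIOS = ("positive", "negative")


def build_test_matrix() -> list[tuple[str, int, str]]:
    """Every (stack, endpoint, scenario) combination."""
    return list(product(STACKS, ENDPOINT_IDS, SCENARIOS))


def energy_delta(e_before: float, e_after: float) -> float:
    """Delta E_i = E_after,i - E_before,i (same unit as the inputs)."""
    return e_after - e_before


def success_rate(n_success: int, n_total: int) -> float:
    """SR = n_success / n_total * 100, in percent."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_success / n_total


if __name__ == "__main__":
    matrix = build_test_matrix()
    print(len(matrix) // len(STACKS), "tests per stack,", len(matrix), "in total")
```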

Phase 5 - Aggregation: Metrics calculated for each stack: $$E_{total} = \sum_{i=1}^{n} \Delta E_i \text{ (kWh)}$$ $$\bar{t}_{response} = \frac{1}{n}\sum_{i=1}^{n} t_{response,i} \text{ (ms)}$$ $$M_{peak} = \max(M_i) \text{ (MB)}$$ $$CO_2eq = E_{total} \times 475 \text{ gCO}_2\text{eq/kWh}$$ Container teardown uses "docker-compose down -v".
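The four aggregation formulas translate directly into code. The sketch below assumes per-request samples are already available with energy in kWh, response time in ms, and memory in MB; the `RequestSample` type and field names are hypothetical, and the sample values are made up purely to exercise the function.

```python
from dataclasses import dataclass

GRID_INTENSITY_G_PER_KWH = 475.0  # global average grid carbon intensity


@dataclass
class RequestSample:
    delta_e_kwh: float   # Delta E_i
    response_ms: float   # t_response,i
    memory_mb: float     # M_i


def aggregate(samples: list[RequestSample]) -> dict[str, float]:
    """Compute E_total, mean response time, M_peak, and CO2eq for one stack."""
    if not samples:
        raise ValueError("at least one sample is required")
    e_total = sum(s.delta_e_kwh for s in samples)
    return {
        "e_total_kwh": e_total,
        "mean_response_ms": sum(s.response_ms for s in samples) / len(samples),
        "m_peak_mb": max(s.memory_mb for s in samples),
        "co2eq_g": e_total * GRID_INTENSITY_G_PER_KWH,
    }


if __name__ == "__main__":
    # Made-up sample values for illustration only.
    demo = [RequestSample(1e-9, 4.0, 100.0),
            RequestSample(2e-9, 6.0, 120.0),
            RequestSample(3e-9, 8.0, 110.0)]
    print(aggregate(demo))
```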

Phase 6 - Analysis: Statistical measures: $$\mu = \frac{1}{n}\sum x_i$$ $$\sigma = \sqrt{\frac{1}{n}\sum(x_i - \mu)^2}$$ Percentiles: $P_{50}, P_{95}, P_{99}$. Results: 105 records (CSV/JSON), 11 visualizations (in PNG or PDF format).
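The statistical measures of Phase 6 can be sketched with the standard library. The standard deviation formula divides by $n$, which corresponds to the population form (`statistics.pstdev`). The percentile method is not specified in the text, so the nearest-rank definition used here is an assumption.

```python
import math
import statistics


def percentile(values: list[float], p: int) -> float:
    """Nearest-rank percentile: smallest value with at least p% of data at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def describe(values: list[float]) -> dict[str, float]:
    """Mean, population standard deviation, and P50/P95/P99."""
    return {
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),  # divides by n, matching the formula
        "p50": percentile(values, 50),
        "p95": percentile(values, 95),
        "p99": percentile(values, 99),
    }


if __name__ == "__main__":
    print(describe([2, 4, 4, 4, 5, 5, 7, 9]))
```

With only 15 endpoint measurements per stack, P95 and P99 under nearest-rank both fall on the largest observation, so the tail percentiles are best read as worst-case values.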

Variable Definitions

The equations use the following symbols: $t$ denotes time measurements (startup time $t_{startup}$ in seconds, response time $t_{response,i}$ in milliseconds); $E$ denotes energy measurements in kilowatt-hours (baseline energy $E_{baseline}$, total energy $E_{total}$, energy delta per request $\Delta E_i$); $M$ denotes memory use in megabytes (baseline memory $M_{baseline}$, peak memory $M_{peak}$, and per-request memory snapshot $M_i$); $SR$ is the success rate, the percentage of requests that succeed; $\mu$ is the mean and $\sigma$ the standard deviation; $P_{x}$ denotes percentile values, where $x$ is the percentile rank (50th, 95th, or 99th); $n$ is the number of requests (the sample size) and $i$ is the request index, running from 1 to $n$; and $CO_2eq$ denotes carbon dioxide equivalent emissions in grams, calculated using the global average grid carbon intensity factor of 475 gCO$_2$eq/kWh.

Test Cases

Table 3{reference-type="ref" reference="tab:test-scenarios"} shows the positive and negative test scenarios run for each technology stack. Each of the 15 CRUD endpoints is tested in two ways: positive cases verify the expected behavior with valid inputs, while negative cases verify correct error handling for invalid inputs, duplicates, missing resources, and malformed data.

::: {#tab:test-scenarios}
| Endpoint | Positive Case | Negative Case |
|---|---|---|
| POST /api/users | Valid user data | Invalid email, duplicate |
| GET /api/users/{id} | Existing user | User does not exist (404) |
| PUT /api/users/{id} | Valid updates | Invalid phone format |
| DELETE /api/users/{id} | Existing user | User does not exist (404) |
| GET /api/users | Pagination | Invalid parameters |
| POST /api/addresses | Valid address | Invalid postal code |

: Test Scenarios (Positive and Negative Cases)
:::

Gathering and Analyzing Data

All measurements are collected in a structured format:

  • Raw Data: JSON files for each stack that contain per-request measurements

  • Summary Data: a CSV file that combines metrics from all requests

  • Statistical Analysis: Mean, median, standard deviation, and 95th percentile
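The three outputs listed above can be sketched as follows: per-request records serialized as JSON, and a per-stack summary row written as CSV. The record fields, column names, and nearest-rank 95th percentile are assumptions made for illustration; the study's actual file layout is not shown.

```python
import csv
import io
import json
import math
import statistics


def summarize(stack: str, records: list[dict]) -> dict:
    """Reduce per-request records to one summary row for a stack."""
    times = sorted(r["response_ms"] for r in records)
    p95_rank = max(1, math.ceil(95 * len(times) / 100))  # nearest-rank
    return {
        "stack": stack,
        "n": len(times),
        "mean_ms": statistics.fmean(times),
        "median_ms": statistics.median(times),
        "std_ms": statistics.pstdev(times),
        "p95_ms": times[p95_rank - 1],
    }


def to_csv(rows: list[dict]) -> str:
    """Render summary rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    # Made-up records for illustration only.
    records = [
        {"endpoint": "GET /api/users", "response_ms": 4.0},
        {"endpoint": "POST /api/users", "response_ms": 8.0},
        {"endpoint": "POST /api/addresses", "response_ms": 6.0},
    ]
    print(json.dumps(records, indent=2))          # raw per-request data
    print(to_csv([summarize("fastapi", records)]))  # summary data
```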

Results

We ran comprehensive tests on all seven technology stacks, testing 15 CRUD endpoints on each for a total of 105 tests. Every stack used the same PostgreSQL database and API integration, which keeps the comparison fair. Energy measurements were collected with Intel RAPL, and carbon emissions were calculated using the global average grid carbon intensity of 475 gCO2eq/kWh.

Startup Time Analysis

Startup time has a significant effect on deployment costs and user experience, especially in serverless and containerized environments. Figure 2{reference-type="ref" reference="fig:startup"} shows cold-start times across all frameworks.

Comparison of startup times across seven backend frameworks{#fig:startup width="70%"}

Python FastAPI (1.010s), Go Gin (1.005s), and JavaScript Express (1.021s) all started in about one second. In contrast, Java Spring Boot was the slowest to start at 20.540s, about 20 times slower than Python FastAPI. Java Quarkus, which is optimized for cloud-native deployments, started in 6.586 seconds, a 68% improvement over Spring Boot but still 6.5 times slower than FastAPI.

These results show the substantial startup overhead of JVM-based frameworks, which matters most in scenarios that require rapid scaling or frequent cold starts.
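The ratios quoted in this subsection follow directly from the reported startup times, as the short check below shows. Only the values stated in the text are used; the helper names are illustrative.

```python
# Startup times in seconds, as reported in the text.
STARTUP_S = {
    "Python FastAPI": 1.010,
    "Go Gin": 1.005,
    "JavaScript Express": 1.021,
    "Java Quarkus": 6.586,
    "Java Spring Boot": 20.540,
}


def slowdown(stack: str, reference: str) -> float:
    """How many times slower `stack` starts compared with `reference`."""
    return STARTUP_S[stack] / STARTUP_S[reference]


def improvement_pct(new: str, old: str) -> float:
    """Relative startup-time reduction of `new` over `old`, in percent."""
    return (1 - STARTUP_S[new] / STARTUP_S[old]) * 100


if __name__ == "__main__":
    print(f"Spring Boot vs FastAPI: {slowdown('Java Spring Boot', 'Python FastAPI'):.1f}x")
    print(f"Quarkus vs Spring Boot: {improvement_pct('Java Quarkus', 'Java Spring Boot'):.0f}% faster")
    print(f"Quarkus vs FastAPI: {slowdown('Java Quarkus', 'Python FastAPI'):.1f}x")
```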

Response Time Performance

Response time was measured consistently across all 15 CRUD endpoints. Figure 3{reference-type="ref" reference="fig:response"} shows the response time distribution by stack.

Response time comparison across technology stacks{#fig:response width="70%"}

Python FastAPI had the lowest average response time at 4.14ms, showing strong runtime performance despite being an interpreted language. JavaScript Express (8.32ms) and Go Gin (15.21ms) followed, and Rust Axum averaged 23.45ms. The JVM-based frameworks had higher latencies, with Java Spring Boot at 106.31ms and Java Quarkus at 111.40ms, which is 25–27 times slower than FastAPI.

Energy Consumption

Energy measurements recorded via Intel RAPL show the electricity required to run all test cases. Figure 4{reference-type="ref" reference="fig:energy"} shows total energy use for each framework.

Comparison of total energy use (µWh){#fig:energy width="65%"}

Python FastAPI used the least energy (730 µWh), followed by JavaScript Express (829 µWh) and Go Gin (934 µWh). The compiled native Rust Axum used 1,689 µWh, 2.3 times more than Python FastAPI. The JVM frameworks used more energy: Java Quarkus (2,911 µWh) and Java Spring Boot (3,319 µWh). C# .NET had the highest energy usage at 4,879 µWh, 6.7 times more than Python FastAPI.

Carbon Emissions

Carbon emission estimates were obtained by multiplying energy consumption by the global average grid carbon intensity of 475 gCO2eq/kWh, as shown in Figure 5{reference-type="ref" reference="fig:carbon"}.

Total carbon emissions (gCO2eq) per framework{#fig:carbon width="65%"}

The carbon emission ranking mirrors the energy consumption pattern: Python FastAPI (0.347 gCO2eq), JavaScript Express (0.394 gCO2eq), Go Gin (0.444 gCO2eq), Rust Axum (0.802 gCO2eq), Java Quarkus (1.383 gCO2eq), Java Spring Boot (1.577 gCO2eq), and C# .NET (2.318 gCO2eq). Choosing Python FastAPI instead of C# .NET would cut carbon emissions by 85% for equivalent workloads.
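Because emissions are a constant multiple of energy, the relative comparisons are the same for both metrics. The sketch below recomputes the quoted ratios (2.3×, 6.7×, 85%) from the reported per-framework values and confirms that the two rankings coincide; it deals only in unit-independent ratios, and the helper names are illustrative.

```python
# Values as reported for each framework.
ENERGY_UWH = {
    "Python FastAPI": 730, "JavaScript Express": 829, "Go Gin": 934,
    "Rust Axum": 1689, "Java Quarkus": 2911, "Java Spring Boot": 3319,
    "C# .NET": 4879,
}
CO2EQ = {
    "Python FastAPI": 0.347, "JavaScript Express": 0.394, "Go Gin": 0.444,
    "Rust Axum": 0.802, "Java Quarkus": 1.383, "Java Spring Boot": 1.577,
    "C# .NET": 2.318,
}


def ratio(metric: dict, stack: str, reference: str) -> float:
    """How many times larger the metric is for `stack` than for `reference`."""
    return metric[stack] / metric[reference]


def reduction_pct(metric: dict, chosen: str, replaced: str) -> float:
    """Relative saving from using `chosen` instead of `replaced`, in percent."""
    return (1 - metric[chosen] / metric[replaced]) * 100


if __name__ == "__main__":
    print(f"Axum vs FastAPI energy: {ratio(ENERGY_UWH, 'Rust Axum', 'Python FastAPI'):.1f}x")
    print(f".NET vs FastAPI energy: {ratio(ENERGY_UWH, 'C# .NET', 'Python FastAPI'):.1f}x")
    print(f"Carbon saving, FastAPI over .NET: {reduction_pct(CO2EQ, 'Python FastAPI', 'C# .NET'):.0f}%")
```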

Analysis of CRUD Operations

Figure 6{reference-type="ref" reference="fig:crud_heatmap"} presents a performance heatmap of response times for the different CRUD operations (Create, Read, Update, Delete, List) across all seven technology stacks. Color intensity indicates response time, which makes performance bottlenecks easy to spot and shows that Python FastAPI consistently outperforms the other stacks across operation types.

CRUD operations performance heatmap (response time in ms){#fig:crud_heatmap width="70%"}

All frameworks showed consistent performance patterns across operation types, with List Users operations typically showing slightly higher response times because of database query complexity. Python FastAPI maintained strong performance for all CRUD operations.

Comparison of Performance in Multiple Dimensions

Figure 7{reference-type="ref" reference="fig:radar"} gives a normalized multi-metric assessment of the key performance dimensions: startup time, response time, energy efficiency, and carbon intensity.

Multi-dimensional performance radar chart (normalized metrics){#fig:radar width="65%"}

Python FastAPI has the most balanced profile, scoring at or near the top in every dimension. Go Gin combines good energy efficiency with a short startup time. The Java frameworks start slowly but show reasonable runtime characteristics. C# .NET underperforms across most dimensions.
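A radar chart needs every metric on a common scale. The normalization used for the figure is not described, so the sketch below shows one common choice as an assumption: min-max scaling inverted so that 1.0 is the best (lowest) value and 0.0 the worst. It is applied here to the reported energy figures only.

```python
def normalize_lower_is_better(values: dict[str, float]) -> dict[str, float]:
    """Min-max scale to [0, 1], with 1.0 for the lowest (best) raw value."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        # All stacks tie on this metric; give everyone the top score.
        return {name: 1.0 for name in values}
    return {name: (hi - v) / (hi - lo) for name, v in values.items()}


# Reported total energy per framework, in µWh.
ENERGY_UWH = {
    "Python FastAPI": 730, "JavaScript Express": 829, "Go Gin": 934,
    "Rust Axum": 1689, "Java Quarkus": 2911, "Java Spring Boot": 3319,
    "C# .NET": 4879,
}

if __name__ == "__main__":
    for name, score in normalize_lower_is_better(ENERGY_UWH).items():
        print(f"{name:20s} {score:.2f}")
```

Min-max scaling is sensitive to outliers: one very slow stack compresses the scores of all the others toward 1.0, which is worth keeping in mind when reading such charts.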

Overall Framework Ranking

Table 4{reference-type="ref" reference="tab:ranking"} shows the full ranking across all measured metrics, combining startup time, response time, energy use, and carbon emissions. The table orders frameworks from most efficient (Python FastAPI) to least efficient (C# .NET), revealing large performance differences between runtime architectures and showing that interpreted languages can outperform compiled alternatives for certain workload types.

::: {#tab:ranking}
| Stack | Startup (s) | Response (ms) | Energy (µWh) | CO2 (gCO2eq) |
|---|---|---|---|---|
| Python FastAPI | 1.010 | 4.14 | 730 | 0.347 |
| JavaScript Express | 1.021 | 8.32 | 829 | 0.394 |
| Go Gin | 1.005 | 15.21 | 934 | 0.444 |
| Rust Axum | 0.004 | 23.45 | 1,689 | 0.802 |
| Java Quarkus | 6.586 | 111.40 | 2,911 | 1.383 |
| Java Spring Boot | 20.540 | 106.31 | 3,319 | 1.577 |
| C# .NET | 2.182 | 74.75 | 4,879 | 2.318 |

: Complete ranking of framework performance
:::

Discussion

Interpretation of Results

Impact of Runtime Architecture

Contrary to the common belief that compiled native languages always outperform interpreted ones in energy efficiency, our results show more complex patterns. Python FastAPI showed the lowest energy use (730 µWh) and the fastest response times (4.14ms), which challenges the assumption that compiled languages automatically translate to better energy efficiency.

Several factors contributed to this result: (1) the mature runtime optimizations in CPython 3.12, (2) FastAPI's efficient async I/O implementation, which keeps the CPU from blocking, and (3) the light test workload, which favors Python's optimized C extensions for database operations. Rust Axum, in contrast, used 2.3× more energy despite native compilation, which suggests that framework-level optimizations and I/O handling patterns affect overall efficiency beyond compilation strategy alone.

The JVM-based frameworks, Java Spring Boot and Quarkus, showed the highest latency (106–111 ms) and substantial energy use (2,911–3,319 µWh). The JVM's JIT compilation warmup, together with garbage collection overhead during execution, contributed to both the longer response times and the higher energy use. Quarkus showed only a slight energy advantage over Spring Boot in JVM mode, which suggests that framework-level optimizations offer limited benefit when constrained by the characteristics of the underlying runtime.

Trade-offs Between Startup Time and Runtime Performance

The 20.540s startup time of Java Spring Boot is a major obstacle for cloud-native deployments that need to scale quickly. Although Quarkus cuts startup time to 6.586 seconds (68% faster), it is still 6.5 times slower than Python FastAPI (1.010s). This difference has a direct effect on:

  • Serverless Economics: Longer cold starts increase the cost of execution time

  • Auto-scaling Efficiency: Slow startup reduces responsiveness to sudden increases in traffic

  • Development Iteration: Feedback loops take longer during local development

Notably, runtime performance does not make up for the startup overhead: the Java frameworks had both the slowest startup and the highest response latencies, which points to systemic inefficiencies rather than a trade-off between startup and runtime.

Recommendations and Insights for Each Framework

Python FastAPI is highly efficient thanks to the uvicorn ASGI event loop, Pydantic's Cython validators, and SQLAlchemy connection pooling. Recommended for: cloud-native deployments, sustainability-critical applications (85% lower emissions than C# .NET), and CRUD-heavy APIs.

Go Gin shows moderate energy use (934 µWh) and fast startup (1.00s). Recommended for: microservice architectures that need predictable resource usage and frequent deployments.

Rust Axum shows higher energy use (1,689 µWh), which suggests room for optimization in the async runtime (Tokio) and the database driver (SQLx).

JVM frameworks show structural inefficiencies: Java Spring Boot (20.54s startup, 3,319 µWh) and Quarkus (6.59s startup, 2,911 µWh). Consider for: legacy enterprise ecosystems that already have JVM infrastructure, while planning a move to GraalVM native images for the longer term.

C# .NET performs poorly across the measured criteria (4,879 µWh, the highest energy use). The CLR garbage collector and framework abstractions pose sustainability challenges.

For Legacy Enterprise Ecosystems

Recommendation: Java Quarkus over Spring Boot

Organizations constrained to JVM environments should migrate from Spring Boot to Quarkus for 68% faster startup. Runtime energy use stays about the same, however, which suggests that JVM-level optimizations (GraalVM native image, G1GC tuning) are needed to approach the efficiency of non-JVM solutions.

Not Recommended: C# .NET

Unless specific ecosystem requirements mandate .NET, alternative frameworks offer better energy efficiency (2–6 times better) and performance (response times that are 2 to 18 times faster).

Limitations

The workload focuses mainly on CRUD operations; the results are hardware-specific (Intel Core i7-1270P); RAPL measures only CPU package power; and framework results depend on the versions tested.

Conclusion and Future Work

This study assessed Software Carbon Intensity across seven backend frameworks, running 105 CRUD tests with PostgreSQL. The most important results are: (1) Python FastAPI used the least energy (730 µWh) and had the fastest response time (4.14ms) and the lowest emissions (0.347 gCO2eq), outperforming the compiled alternatives; (2) JVM frameworks showed structural inefficiencies, with slow startup (6.586–20.540 seconds) and high energy use (2,911–3,319 µWh); (3) the 6.7× energy variance indicates an 85% carbon reduction potential through framework choice; (4) the 20× difference in startup time makes JVM frameworks a poor fit for cloud-native deployments without GraalVM native compilation.

Implications and Future Directions

Practitioners should weigh operational carbon footprint alongside development speed when choosing technology stacks, and profile production applications to find energy hotspots. Hybrid approaches can use energy-efficient languages for compute-intensive microservices while keeping productive languages for less critical components.

Future work could investigate: (1) broader workload categories (batch processing, streaming, ML inference), (2) multi-tenant cloud environments with elastic scaling, (3) tracking how frameworks evolve across versions over time, (4) the effect of coding patterns on energy use, (5) carbon-aware workload scheduling methods, and (6) IDE plugins that provide real-time feedback on energy use.

Final Thoughts

As digital transformation accelerates and action on climate change becomes more pressing, Green Software Engineering must evolve from a niche concern into a basic principle of software development. This study shows that technology choices have a significant effect on environmental sustainability, and that evidence-based decisions can substantially lower the carbon footprint of enterprise systems.

By making our complete dataset, measurement tools, and analysis scripts publicly available, we hope to encourage further research and to help practitioners make informed, sustainable technology choices.

Acknowledgments {#thanks.unnumbered}

The authors gratefully acknowledge financial support from STMIK Tazkia for sponsoring the publication, and thank Jejakin.com, a climate technology company, for providing research facilities, computing resources, and carbon calculation expertise. We are also grateful to the open-source communities behind Spring, Quarkus, Gin, Axum, FastAPI, Express, and .NET.