In the global push to combat climate change and accelerate the energy transition, accurately assessing greenhouse gas emissions and developing effective mitigation strategies have become central tasks in the energy sector. However, the oil and gas industry, a critical part of the global energy system, faces significant challenges in obtaining key data. Information is highly fragmented, scattered across costly proprietary databases, and often presented in inconsistent formats. Traditional manual extraction is not only inefficient but also prone to error, and official data updates lag behind, failing to meet the needs of real-time decision-making.
Harnessing Large Language Models: A Leap in Efficiency and Cost Reduction
To address these long-standing data challenges, Stanford University Ph.D. student Zhenlin Chen and his research team have made a significant breakthrough. Leveraging cutting-edge large language models (LLMs), they developed an innovative framework that opens a new pathway for data acquisition in the energy sector. By tapping into the strong text comprehension capabilities of models such as GPT-4 and GPT-4o, the framework can efficiently and accurately extract key data from a wide array of publicly available documents related to the oil and gas industry, including academic journal papers, news articles, and other sources, greatly expanding the breadth and depth of available data.
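To make the workflow concrete, the extraction step can be pictured as a single structured-output call to the model. The prompt wording, parameter list, and JSON schema below are illustrative assumptions rather than the team's published prompts; only the use of GPT-4o for document-to-data extraction comes from the article.

```python
# Illustrative sketch of LLM-based field extraction (not the team's actual prompts).
# Assumes the OpenAI Python SDK and an OPENAI_API_KEY environment variable.
import json
from openai import OpenAI

client = OpenAI()

PARAMETERS = ["gas-oil ratio", "water-oil ratio"]  # two of the 51 target parameters

def extract_parameters(document_text: str) -> dict:
    """Ask GPT-4o to pull the target parameters out of one document as JSON."""
    prompt = (
        "From the document below, extract the following oil and gas parameters: "
        + ", ".join(PARAMETERS)
        + '. Reply with JSON only, e.g. {"gas-oil ratio": {"value": 850, "unit": "scf/bbl"}}. '
        + "Use null for any parameter the document does not report.\n\n"
        + document_text
    )
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,                               # deterministic output for extraction
        response_format={"type": "json_object"},     # constrain the reply to valid JSON
    )
    return json.loads(response.choices[0].message.content)
```

Zero-shot prompting of this kind, asking for structured JSON without any task-specific training, is what allows the framework to scale across document types it has never seen before.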
The framework demonstrates two core advantages. In terms of cost-effectiveness, by carefully optimizing how GPT-4o is applied, the team cut the cost of extracting a single data point to just $0.04, nearly ten times cheaper than traditional methods, a breakthrough for organizations constrained by the high cost of data acquisition. In terms of accuracy and efficiency, the model also performed impressively, achieving an accuracy of 83.74% and an F1 score of 78.16% on the test dataset, demonstrating robust adaptability to complex, multi-source information.
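The article does not spell out how those metrics are computed, but for field-level extraction they are typically derived from counts of correct, incorrect, and missed values. The counts in the sketch below are invented purely to show the arithmetic and are not figures from the study.

```python
# How accuracy and F1 are typically computed for field-level extraction.
# The counts below are hypothetical placeholders, not results from the study.
true_positive  = 820   # extracted and matching the expert label
false_positive = 130   # extracted but wrong (e.g., a bad unit conversion)
false_negative = 100   # present in the document but missed by the model
true_negative  = 150   # absent from the document and correctly left blank

total     = true_positive + false_positive + false_negative + true_negative
accuracy  = (true_positive + true_negative) / total
precision = true_positive / (true_positive + false_positive)
recall    = true_positive / (true_positive + false_negative)
f1        = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.2%} precision={precision:.2%} recall={recall:.2%} f1={f1:.2%}")
```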
Expert-Labeled Datasets Drive Accuracy in Real-World Applications

To validate the effectiveness of this innovative framework, the team conducted a series of rigorous studies. They constructed a specialized dataset containing 108 documents, comprehensively covering 51 key parameters critical to the energy industry, such as gas-oil ratio and water-oil ratio. To ensure data accuracy and reliability, domain experts were invited to manually annotate the data, creating a high-quality benchmark to support model training and optimization.
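One way to picture the resulting benchmark is as a table of expert-labeled records, one per parameter per document. The field names and example values below are assumptions for illustration; the team's actual dataset may be organized differently.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class BenchmarkRecord:
    """One expert-annotated value for one parameter in one source document."""
    document_id: str               # e.g., a DOI or file name
    parameter: str                 # one of the 51 parameters, e.g., "gas-oil ratio"
    expert_value: Optional[float]  # None if the document does not report the parameter
    unit: Optional[str]            # e.g., "scf/bbl"
    evidence: str                  # text span the annotator relied on

# Example record; all values are invented for illustration.
example = BenchmarkRecord(
    document_id="doc_042",
    parameter="gas-oil ratio",
    expert_value=850.0,
    unit="scf/bbl",
    evidence="...an average GOR of 850 scf/bbl was reported for the field...",
)
```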
Chen explained, “We deeply integrated domain-specific knowledge with numerical computation techniques, including the use of physical and thermodynamic equations. Then, through multiple rounds of cross-checking, we iteratively compared expert-calculated results with model outputs to refine performance.”
During the research process, the team further refined the framework, closely examining errors and inconsistencies between extracted data and human-labeled results. A detailed error analysis revealed two main sources of deviation: first, human annotation is not infallible and can contain mistakes, and in some cases the model's interpretation was actually the more accurate one; second, the model occasionally struggled with unit conversion and complex numerical operations. In response, the team intensified correction and refinement, boosting the model's accuracy from an initial 63.6% to 83.74%.
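Because unit conversion emerged as a recurring failure mode, one practical safeguard is to normalize units before comparing model output with the expert label. The sketch below does this for the gas-oil ratio; the conversion factor (1 m³/m³ ≈ 5.615 scf/bbl) is standard, while the tolerance and example values are assumptions made here for illustration.

```python
# Post-extraction consistency check for gas-oil ratio (GOR) values.
# 1 m3 of gas per m3 of oil ≈ 35.3147 scf / 6.2898 bbl ≈ 5.615 scf/bbl.
M3_PER_M3_TO_SCF_PER_BBL = 35.3147 / 6.2898

def gor_to_scf_per_bbl(value: float, unit: str) -> float:
    """Normalize a gas-oil ratio to scf/bbl."""
    if unit == "scf/bbl":
        return value
    if unit == "m3/m3":
        return value * M3_PER_M3_TO_SCF_PER_BBL
    raise ValueError(f"unsupported unit: {unit}")

def matches_expert(model_value: float, model_unit: str,
                   expert_value: float, expert_unit: str,
                   rel_tol: float = 0.02) -> bool:
    """True if the extraction agrees with the expert label after unit normalization."""
    m = gor_to_scf_per_bbl(model_value, model_unit)
    e = gor_to_scf_per_bbl(expert_value, expert_unit)
    return abs(m - e) <= rel_tol * abs(e)

# Example: a value reported as 151.4 m3/m3 matches an expert label of 850 scf/bbl.
print(matches_expert(151.4, "m3/m3", 850.0, "scf/bbl"))  # True
```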
In terms of processing efficiency, the new framework showed remarkable advantages, extracting large volumes of data from 32 documents in just 61.41 minutes, an average of only 7.09 seconds per document and a substantial improvement over manual methods. The study also found that text type significantly affects extraction efficiency: news articles, with their simpler structure and straightforward language, are processed much faster than complex technical literature. According to Chen, one of the biggest challenges was developing effective zero-shot learning strategies, which required continuous prompt engineering and iteration. Ultimately, the team established a systematic prompt optimization methodology, unlocking new potential for LLM applications in the energy domain.
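The "systematic prompt optimization" the team describes can be pictured as an evaluation loop: each candidate prompt is scored against the expert benchmark and the best performer is kept. The prompt variants, helper signature, and scoring rule below are illustrative assumptions, not the team's published procedure.

```python
# Minimal sketch of comparing prompt variants against an expert-labeled benchmark.
from typing import Callable

PROMPT_VARIANTS = {
    "v1_plain":  "Extract the {parameter} from the text below. Reply with a number and unit.",
    "v2_schema": "Extract the {parameter} from the text below. Reply as JSON: "
                 '{{"value": <number>, "unit": "<unit>"}}. Use null if it is not reported.',
}

def score_prompts(extract: Callable[[str, str], str],
                  benchmark: list[dict]) -> dict[str, float]:
    """Score each variant as the fraction of benchmark records it answers correctly."""
    scores = {}
    for name, template in PROMPT_VARIANTS.items():
        correct = 0
        for record in benchmark:
            prompt = template.format(parameter=record["parameter"])
            answer = extract(prompt, record["document_text"])  # wraps the LLM call
            correct += int(answer == record["expert_answer"])
        scores[name] = correct / len(benchmark)
    return scores
```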
Cross-Model Collaboration and Transfer Learning: Scaling Across the Energy Supply Chain
Notably, the framework is highly versatile. It excels not only in upstream applications of the energy industry but also adapts well to data extraction tasks in midstream and downstream segments. In the power generation sector, for instance, it can efficiently process annual power plant reports issued by regulatory agencies, as well as periodic energy statistics published by governments. These reports contain data on traditional oil and gas operations, downstream energy use, and data center performance, presented in a variety of structured formats.
Chen emphasized: “We aim to build a framework with strong transfer learning capabilities, allowing it to flexibly adapt to different application scenarios and continuously generate value through a ‘learning-by-analogy’ mechanism.”
Looking ahead, the researchers have a clear roadmap. Their next goal is to further enhance accuracy and optimize the system architecture. “In the early stages of this project, we primarily relied on GPT models,” Chen explained. “But with the rapid advancement of LLM technologies and the emergence of more diverse models, we plan to integrate multiple models—such as DeepSeek—for collaborative document reading and cross-validation, thereby increasing model reliability.”
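A simple version of that cross-validation idea is to request the same field from two independent models and accept only values on which they agree, flagging disagreements for review. Since the multi-model design is still planned work, the helper functions and tolerance below are placeholders rather than a description of the team's implementation.

```python
# Sketch of cross-model validation: accept a value only when two models agree.
from typing import Callable, Optional

def cross_validate(ask_model_a: Callable[[str], Optional[float]],
                   ask_model_b: Callable[[str], Optional[float]],
                   prompt: str, rel_tol: float = 0.05) -> Optional[float]:
    """Return the consensus value, or None to flag the field for human review."""
    a = ask_model_a(prompt)   # e.g., a GPT-4o call returning a number or None
    b = ask_model_b(prompt)   # e.g., a DeepSeek call returning a number or None
    if a is None or b is None:
        return None           # one model found nothing: escalate
    if abs(a - b) <= rel_tol * max(abs(a), abs(b)):
        return (a + b) / 2    # close enough: accept the averaged value
    return None               # disagreement: escalate to a human reviewer
```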
Currently, Dr. Wennan Long, a team member, has completed a comprehensive study on global LNG (liquefied natural gas) carbon emissions, meticulously tracking the carbon footprint across the entire supply chain from upstream extraction to downstream application. The related paper is now under review. The team also intends to use systematic error analysis to further investigate why models struggle with certain types of content. “By analyzing large volumes of error samples,” Chen noted, “we aim to pinpoint the model’s blind spots and common failure modes, which will directly inform future model improvements.”
On the application side, the research team has already completed qualitative analysis for the upstream sector and will now expand into comprehensive assessments of the midstream and downstream sectors. “We hope this study will serve as a landmark achievement in the integration of AI and the energy sector,” said Chen, “providing essential data support for the scientific formulation of global climate policies.”
The research, titled “Advancing oil and gas emissions assessment through large language model data extraction,” was recently published in the journal Energy and AI, drawing widespread attention from both the energy and artificial intelligence communities. This innovative framework is poised to inject new momentum into data acquisition and analysis in the energy sector, paving the way for more sustainable development worldwide.