Research on ways to generate program code for REST API applications

Contents
- Overview of methods
- Using a ready-made LLM via the API
- Creating and training your own model
- Using standard approaches
- Comparison of methods
- The hybrid method
- Conclusion
Modern software development is characterized by increasing demands on the speed of creation and quality of software products. REST API applications, which are the basis of most web services and microservice architectures, require a significant amount of development time.
Automatic generation of program code from natural language descriptions is a promising area that can radically change approaches to software development. The emergence of large language models (LLMs) and advances in machine learning technologies have opened up new opportunities for automating programming tasks.
This article analyzes various approaches to generating program code based on the user's text description. In particular, the following methods are considered:
- using a ready-made LLM via the API;
- creating your own model with training;
- using standard generation algorithms that do not rely on neural network technologies.
1. Overview of methods
1.1 Using a ready-made LLM via the API
This method is based on ready-made large language models accessed through provider APIs. Among the promising options, the following solutions stand out:
- OpenAI (GPT);
- Yandex AI (YandexGPT);
- Sber AI (GigaChat).
OpenAI has presented the GPT-4.1 family of models, specifically optimized for programming tasks. The models show a significant improvement on the SWE-bench Verified benchmark, outperforming GPT-4o by 21.4 percentage points and GPT-4.5 by 26.6 percentage points, which makes them leaders in programming.
Key Features:
- knowledge base updated with code through June 2024;
- a context window of up to 1 million tokens, the largest among the solutions under consideration;
- multimodal capabilities (text + images);
- structured output (JSON) support;
- integration with external tools via function calling;
- specialized optimization for programming tasks.
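As a brief illustration, the sketch below requests REST endpoint code through a chat-completion API with structured JSON output. It assumes the official openai Python package; the model identifier and the prompts are illustrative, not a definitive integration.

```python
# A minimal sketch: requesting generated REST API code as structured JSON.
# Assumes the official `openai` package; model name and prompts are illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4.1",  # assumed model identifier
    response_format={"type": "json_object"},  # structured (JSON) output
    messages=[
        {
            "role": "system",
            "content": 'You generate REST API code. Reply as JSON: {"filename": "...", "code": "..."}',
        },
        {
            "role": "user",
            "content": "Create a FastAPI endpoint GET /users/{id} that returns a user by id.",
        },
    ],
)

print(response.choices[0].message.content)  # JSON string with the generated code
```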
Yandex has released the fifth generation of its language models in two variants: YandexGPT 5 Pro for complex tasks and YandexGPT Lite for quick answers. The 5 Pro model outperforms the previous generation in 67% of cases, especially in understanding complex instructions and working with external sources.
Key Features:
- specialized classifiers;
- the possibility of fine-tuning the model (Preview);
- working with external APIs through function calls;
- support for streaming generation;
- compliance with 152-FZ requirements and ISO standards;
- corporate support.
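A similar request to YandexGPT can be made over its REST interface. The sketch below follows the payload format described in Yandex Cloud documentation for the Foundation Models API; the folder ID and API key are placeholders, and the exact request shape should be checked against the current documentation.

```python
# A sketch of calling YandexGPT through the Foundation Models REST API.
# <folder_id> and <api_key> are placeholders; payload format per Yandex Cloud docs.
import requests

FOLDER_ID = "<folder_id>"
API_KEY = "<api_key>"

payload = {
    "modelUri": f"gpt://{FOLDER_ID}/yandexgpt/latest",
    "completionOptions": {"stream": False, "temperature": 0.2, "maxTokens": "2000"},
    "messages": [
        {"role": "system", "text": "You generate REST API application code."},
        {"role": "user", "text": "Create a FastAPI handler for GET /users/{id}."},
    ],
}

response = requests.post(
    "https://llm.api.cloud.yandex.net/foundationModels/v1/completion",
    headers={"Authorization": f"Api-Key {API_KEY}"},
    json=payload,
    timeout=60,
)
print(response.json()["result"]["alternatives"][0]["message"]["text"])
```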
Sber has released an updated version, GigaChat 2.0, which, according to the MERA benchmark, surpasses the international models GPT-4o, DeepSeek-V3, LLaMA 70B, and Qwen2.5 on Russian-language metrics and ranks first.
Key Features:
- high performance in Russian;
- access to real-time data from the Internet;
- working with text and images;
- music and vocal generation;
- local data storage in the Russian Federation;
- compliance with the requirements of 152-FZ.
Using one of these solutions eliminates the need to develop your own model, which saves time and resources. This approach can speed up delivery of the final software product, but it creates dependence on external factors and reduces implementation flexibility. It is also important to keep in mind that using an external service means passing user requests to a third party, which may not be acceptable in all cases. The following table compares the solutions:
| Characteristic | OpenAI GPT-4.1 | YandexGPT 5 Pro | GigaChat 2.0 MAX |
|---|---|---|---|
| Context window | Up to 1,000,000 tokens | Up to 32,000 tokens | Up to 128,000 tokens |
| Language support | Multilingual (strong in English) | Multilingual (strong in Russian) | Russian + others |
| HumanEval | 90.2 (GPT-4o) | 85.5 | 87.2 |
| Function calling | Yes | Yes | Yes (Preview) |
| Fine-tuning | Limited | Yes (Preview) | No |
| Availability in Russia | No | Yes | Yes |
| Local deployment | No | Yes | Yes |
| Compliance | GDPR, SOC 2 | 152-FZ, ISO | 152-FZ |
OpenAI stands out with the best HumanEval result (achieved by the GPT-4o model). However, its API is not available in a number of regions, including the Russian Federation, which complicates both development and subsequent operation of the entire project. It also offers no way to deploy the model locally.
Yandex and Sber comply with the personal data law when working through their APIs and, in addition, can offer on-premises deployment of their models if necessary. Their models specialize in the Russian language, which should help them better understand the context of a request from a Russian-speaking user.
Given the specifics of working with LLMs from third-party vendors, the most reliable option should be preferred. Under current conditions it is advisable to favor domestic solutions, namely YandexGPT and GigaChat. It is important to note that performance on code generation tasks can vary significantly depending on the programming language, the complexity of the task, and the quality of the tooling, so practical testing is recommended before making a final decision.
1.2 Creating and training your own model
Developing your own specialized model for generating application code is a complex process that requires time and significant resources. This approach makes it possible to create a model precisely tailored to the specific requirements of the project, but it carries high risks and an uncertain outcome. The process of creating your own model includes the following key steps:
1. preparing the dataset;
2. designing the architecture;
3. training and optimization.
Preparing the dataset is itself the most difficult stage of development, and the specifics of generating program code complicate the task considerably. The dataset must contain paired data: a text query and the corresponding code. The code examples used for training should cover a variety of architectural patterns, multiple languages, and different types of operations. For effective training, the dataset must be large enough and may contain hundreds of thousands of pairs. It is difficult to name a specific number, since it depends on many factors, including the overall quality of the dataset, the uniqueness of its pairs, and how well the chosen model architecture performs on programming tasks.
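To make the format of such paired data concrete, here is a small sketch of assembling (query, code) pairs into a JSONL file, the form most fine-tuning pipelines accept; the file name and field names are illustrative assumptions.

```python
# A minimal sketch of assembling a (query, code) dataset in JSONL format.
# Field names and the file name are illustrative.
import json

pairs = [
    {
        "prompt": "Create a GET /health endpoint that returns service status.",
        "completion": "@app.get('/health')\ndef health():\n    return {'status': 'ok'}\n",
    },
    # ... hundreds of thousands of pairs covering different architectural
    # patterns, languages, and operation types
]

with open("rest_api_dataset.jsonl", "w", encoding="utf-8") as f:
    for pair in pairs:
        f.write(json.dumps(pair, ensure_ascii=False) + "\n")
```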
Even large technology companies face serious challenges at this stage: to train the model behind GitHub Copilot, OpenAI hired a team of programmers to write reference code, which significantly increased the cost of the project.
For code generation, the most effective architectures are Transformer-based, which divide into Encoder-Decoder and Decoder-only models.
Encoder-Decoder models:
- the encoder processes a text request;
- the decoder generates the corresponding program code;
- advantages: high generation quality, the ability to control the output length;
- disadvantages: high demands on computing resources.
Decoder-only models (GPT-like):
- combining the specification and code into a single sequence;
- using special tokens to separate contexts;
- advantages: simplicity of architecture, the ability to fine-tune existing models;
- disadvantages: the difficulty of controlling the length of the generated code;
There are also hybrid architectures that can combine an already pre-trained model for natural language understanding, a specialized encoder for processing structured data, and a validation module that verifies the correctness of the generated code.
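To illustrate the Decoder-only idea of merging the specification and code into a single sequence with separating special tokens, here is a minimal sketch using the Hugging Face transformers library; the base model (gpt2) and the token names are illustrative stand-ins.

```python
# A sketch of the decoder-only setup: the request and the code share one
# sequence, separated by special tokens. Base model and tokens are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder base model
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Special tokens separating the natural language request from the code
tokenizer.add_special_tokens({"additional_special_tokens": ["<|spec|>", "<|code|>"]})
model.resize_token_embeddings(len(tokenizer))

# The specification and the code form a single sequence
sample = "<|spec|> Create a GET /users endpoint returning all users. <|code|>"
inputs = tokenizer(sample, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0]))
```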
Finally, the model must be trained on the dataset prepared at the first stage. Training can take several days and requires several high-performance graphics accelerators, whose capacity can be rented for the duration of training and optimization. The training time strongly depends on the size of the dataset, the model's demands on computing resources, and the training strategy itself.
The following step-by-step strategy can be used to train the model:
1. pre-training on a general code corpus (GitHub, Stack Overflow);
2. fine-tuning on a specialized dataset (REST API);
3. Reinforcement Learning from Human Feedback (RLHF) to improve quality.
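As a rough illustration of the second stage (fine-tuning on a specialized dataset), the sketch below uses the transformers Trainer on the JSONL dataset from the earlier example; the base model, hyperparameters, and field names are placeholder assumptions, not a production recipe.

```python
# A condensed sketch of fine-tuning a causal LM on the (prompt, completion)
# pairs; model choice and hyperparameters are placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder base model
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

# The pairs assembled at the dataset preparation stage
dataset = load_dataset("json", data_files="rest_api_dataset.jsonl")["train"]

def tokenize(batch):
    texts = [p + "\n" + c for p, c in zip(batch["prompt"], batch["completion"])]
    return tokenizer(texts, truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="rest-api-model", num_train_epochs=3,
                           per_device_train_batch_size=4),
    train_dataset=tokenized,
    # copies input_ids into labels for causal language modeling
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```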
As a result, creating your own model requires a considerable amount of time, as well as additional financial investment in collecting the dataset and renting high-performance servers for training. Most importantly, there is no guarantee of a satisfactory result, and all the steps may have to be repeated until the required generation accuracy is achieved.
1.3 Using standard approaches
Classical approaches to program code generation rely on pre-defined algorithms that produce fairly predictable results. In the context of REST API application generation, the following main solutions can be distinguished:
1. template programming;
2. syntactic analysis and DSLs (Domain-Specific Languages);
3. ontologies and knowledge bases.
Template programming is a methodology for generating code based on pre-created templates with variable parameters. The main algorithms in this approach are:
The parameter substitution algorithm (illustrated by the sketch below):
1. Input data parsing (API schema, configuration);
2. Extraction of generation parameters;
3. Using templates with value substitution;
4. Post-processing of the generated code.
Template composition algorithm:
1. Decomposition of complex structures into basic components;
2. Applying hierarchical templates;
3. Assembling the components into a complete structure;
4. Optimization of the resulting code.
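The parameter substitution algorithm can be sketched with the Jinja2 template engine: a template with variable parameters is filled in from a parsed API schema. The template text and the parameter dictionary here are illustrative, standing in for real parsed input.

```python
# A minimal sketch of parameter substitution with Jinja2; template and
# parameters are illustrative.
from jinja2 import Template

# A template with variable parameters (step 3 of the algorithm)
ENDPOINT_TEMPLATE = Template(
    '@app.{{ method }}("{{ path }}")\n'
    'def {{ handler_name }}():\n'
    '    """{{ description }}"""\n'
    "    return {{ response }}\n"
)

# Parameters extracted from the parsed input schema (steps 1-2)
params = {
    "method": "get",
    "path": "/users",
    "handler_name": "list_users",
    "description": "Return the list of users.",
    "response": '{"users": []}',
}

print(ENDPOINT_TEMPLATE.render(**params))  # the generated endpoint code
```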
DSL approaches are based on the creation of specialized languages for API description and subsequent translation into executable code.
DSL translation algorithm:
1. Lexical analysis of the DSL specification;
2. Syntactic analysis and construction of an abstract syntax tree (AST);
3. Semantic analysis and validation;
4. Generation of target code based on AST;
5. Optimization and formatting of the result.
DSL types for REST API:
- declarative DSLs - describe "what" should be generated (OpenAPI, RAML);
- imperative DSLs - describe "how" to generate code (Gradle DSL, Maven DSL);
- hybrid DSLs combine declarative and imperative elements;
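As a toy illustration of the translation pipeline, the sketch below parses a small declarative specification (a made-up OpenAPI-like subset in YAML) and generates handler stubs by traversing the resulting tree; a YAML parser stands in for the lexical and syntactic analysis stages.

```python
# A toy DSL translation sketch; the spec format is a made-up OpenAPI-like subset.
import yaml  # PyYAML

SPEC = """
endpoints:
  - path: /users
    method: get
    handler: list_users
  - path: /users/{id}
    method: get
    handler: get_user
"""

tree = yaml.safe_load(SPEC)  # lexical + syntactic analysis yields a tree

# Target code generation by traversing the tree
for ep in tree["endpoints"]:
    print(f'@app.{ep["method"]}("{ep["path"]}")')
    print(f'def {ep["handler"]}():')
    print("    ...\n")
```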
The ontology and knowledge base approach is based on formalizing knowledge about the REST API domain in the form of ontologies and applying inference rules for code generation.
The ontological generation algorithm:
1. Loading the domain ontology;
2. Analysis of user requirements;
3. Application of inference rules;
4. Formation of the solution model;
5. Code generation based on the model.
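A toy sketch of the idea: facts describe the domain, and inference rules derive which artifacts the generator should produce. Both the facts and the rules here are illustrative, not a real ontology engine.

```python
# Facts formalize knowledge about the domain entity (illustrative)
facts = {"entity": "User", "storage": "database", "exposed": True}

# Inference rules: a condition over the facts and the artifact it implies
rules = [
    (lambda f: f["exposed"], "CRUD endpoints for {entity}"),
    (lambda f: f["storage"] == "database", "repository layer for {entity}"),
]

# Applying the rules forms the solution model (steps 3-4 of the algorithm)
solution_model = [artifact.format(**facts) for cond, artifact in rules if cond(facts)]
print(solution_model)  # input for the final code generation step
```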
As mentioned earlier, these solutions produce predictable output, which is both a strength and a weakness. Code generated by such methods is the easiest to test, and its operability is guaranteed. Other important advantages are high generation speed and low resource costs.
However, there are problems: limited flexibility, complicated maintenance, and the quality of the result. Everything comes down to templates, which must be created for all possible cases. Not every template is optimal for a specific user request, and it is difficult to anticipate all user needs. As a result, the user is not only limited in the generation parameters that can be set, but is also likely to receive a result that is not optimized for the task at hand and does not meet software development standards.
2. Comparison of methods
The review of code generation methods examined where each can be applied and identified its advantages and disadvantages. Based on its results, the following comparative table can be compiled:
| Criterion | LLM API | Own model | Traditional methods |
|---|---|---|---|
| Code quality | High | Medium-high | Average |
| Speed of implementation | Fast | Slow | Average |
| Initial costs | Low | Very high | Medium |
| Operating expenses | Medium-high | Low | Very low |
| Flexibility | High | Very high | Low |
| Confidentiality | Low | High | High |
|  | Limited | High | Average |
According to the table, using third-party LLMs through a provider's API is optimal in scenarios where speed of implementation is the priority and there is a sufficient budget for recurring costs. This method is also effective when high-quality generated code is required and data privacy is a low priority. It is well suited to quickly validating designed concepts and obtaining first results, but it is also suitable for use in the final software product.
Creating and training your own model is justified in cases of increased information security requirements. This method becomes economically feasible when the total cost of developing and maintaining your own solution is lower than the operating costs of using external APIs. This approach requires long-term planning and investments in the development of technological expertise, but provides full control over the development and the possibility of continuous improvement of the model.
Classical methods of code generation remain relevant in conditions of a limited budget, when investments in modern AI solutions are not justified by the scale of the tasks. Traditional methods are optimally suited for solving simple, repetitive tasks with clear logic, where high adaptability and creativity in generation are not required. The advantage of this approach is the maximum predictability of the results and the complete determinism of the generation process.
3. The hybrid method
An analysis of the considered approaches shows that each method has its own strengths and limitations. The hybrid approach is the integration of various code generation methods into a single system that dynamically selects the optimal way to solve a problem, depending on its specifics and context.
A hybrid code generation system for REST API applications can consist of several key components.
The hybrid approach makes it possible to compensate for the disadvantages of one method with the advantages of the other. For example, the high cost of the LLM API can be reduced by using template generation for standard components. The system automatically distributes tasks between methods depending on their effectiveness for specific types of operations, which leads to optimal use of computing resources and budget. The ability to roll back to alternative generation methods increases the overall reliability of the system. If one method is unavailable or gives unsatisfactory results, the system may switch to another.
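A compact sketch of this dispatching logic: standard requests are routed to cheap template generation, other requests go to an LLM, and a failed LLM call rolls back to templates. All components here are illustrative stubs, not a definitive design.

```python
# A sketch of hybrid dispatching between generation methods; all parts are stubs.
def generate_by_template(request: str) -> str:
    # deterministic, cheap path for standard components
    return f"# template-generated code for: {request}"

def generate_by_llm(request: str) -> str:
    # stub standing in for a real LLM API call
    raise RuntimeError("LLM provider unavailable")

STANDARD_KEYWORDS = ("crud", "health check", "list of", "get by id")

def generate(request: str) -> str:
    if any(k in request.lower() for k in STANDARD_KEYWORDS):
        return generate_by_template(request)  # standard task: templates suffice
    try:
        return generate_by_llm(request)       # flexible but costly path
    except Exception:
        # rollback to an alternative method increases overall reliability
        return generate_by_template(request)

print(generate("CRUD for the Order entity"))
```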
The hybrid method is a more advanced solution in the field of automatic code generation, combining the best aspects of various methods. Proper implementation of such a system can significantly increase the efficiency of REST API application development, reduce costs and improve the quality of the resulting code.
Conclusion
Each of the considered code generation methods has its own advantages and areas of application:
- LLM API solutions demonstrate the highest quality of generated code and maximum flexibility, but require constant operating costs and create dependence on external providers;
- proprietary models provide full control over the generation process and long-term cost-effectiveness, but require significant initial investment and deep technical expertise;
- traditional methods guarantee predictable results and minimal costs, but are limited in flexibility and adaptability.
Key factors influencing the choice of approach:
- the scale and complexity of the project;
- budget constraints and funding model;
- information security requirements;
- availability of technical expertise;
- time frame of implementation.
Automatic generation of program code for REST API applications is on the verge of wide practical application. The development of artificial intelligence technologies and the emergence of effective language models create the prerequisites for revolutionary changes in software development processes.
Successful implementation of automatic code generation systems requires a balanced approach to the choice of technologies, taking into account the specifics of the project and strategic planning for the development of the technological expertise of the organization. Hybrid solutions combining the advantages of different approaches represent the most promising area for practical application.
Sources
1. https://openai.com/index/gpt-4-1/
2. https://ya.ru/ai/gpt
3. https://sber.pro/publication/sber-predstavil-obnovlyonnuyu-neiroset-gigachat-20/
4. https://textcortex.com/post/gpt-4o-review
5. https://habr.com/ru/companies/bothub/articles/893128/
6. https://habr.com/ru/companies/sberdevices/articles/890552/
7. Nijkamp, E., Pang, B., Hayashi, H., Tu, L., Wang, H., Zhou, Y., ... & Xiong, C. (2022). CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis. arXiv preprint arXiv:2203.13474.
8. https://systems-analysis.ru/wiki/ARCHITECTURESLLM