A New Approach to Web Application Security: Utilizing GPT Language Models for Source Code Inspection
Abstract
1. Introduction
1.1. AI in Software Development
1.2. The Challenges of Contemporary Web Frameworks
1.3. Sensitivity and Potential Vulnerabilities
- Definition of a proprietary classification to quantify the sensitivity and immunity of the data and evaluating it on a test set of 200 variable names using the GPT API. This classification provides a robust framework for comprehending and classifying data according to its sensitivity. By evaluating it using the GPT API, we ensure that the model recognizes and processes diverse data categories appropriately, thereby making our approach more applicable to real-world scenarios.
- Demonstration of prompts generated using prompt engineering for the GPT API for static analysis of the code of a complex web application, taking the larger context and deeper context into account. We maximize the potential of the GPT models through prompt engineering, ensuring that the static analysis is not only accurate but also context-aware. This method bridges the divide between generic AI analysis and specific software vulnerabilities, yielding both broad and profound insights.
- Evaluation of the efficacy of the GPT-3.5 and GPT-4 models in detecting sensitive data in an application via static code analysis. By comparing the performance of various variants of the GPT models, we seek to discern the evolution and enhancements in these models over time. This evaluation sheds light on how advancements in LLMs can further strengthen software application security.
- Evaluation of the ability of the GPT-3.5 and GPT-4 models to determine the protection levels of front-end application elements. In addition to detecting sensitive data, it is essential to ensure that it is adequately protected. By assessing the GPT models’ ability to estimate protection levels, we can better comprehend their potential role in security assessments and potential interventions.
- Evaluation of the effectiveness of the GPT-3.5 and GPT-4 models in detecting when sensitive data handling in a web application is not adequately isolated and protected, thus producing a possible vulnerability. This evaluation is essential to comprehending the practical applicability of LLMs in cybersecurity. If these models can reliably identify data management vulnerabilities, they could become indispensable for application developers seeking to fortify their applications against intrusions.
2. Related Works
2.1. The Impact of Artificial Intelligence on Software Development
2.2. The Code Interpretation Capabilities of GPT
2.3. Vulnerability Detection with Large Language Models
3. Methodology
3.1. Categorization of Sensitivity and Security
- Level 1 (Low Sensitivity): Data at this level must be accumulated and compiled in large quantities before confidential information can be inferred or abuse opportunities arise. Examples include, among other things, a user’s behavior history, a list of websites visited, products and topics of interest on a website, and search history.
- Level 2 (Medium Sensitivity): By obtaining data at this level, it is possible to retrieve and compile potentially exploitable information. Solutions such as two-factor authentication, regular email and SMS notifications, and exhaustive logging in the affected applications can mitigate the effects of a potential breach. This is the largest and most extensive group. It includes data such as usernames, passwords, private records, political views, sexual orientation, IP address, physical location, and hectic schedules.
- Level 3 (High Sensitivity): The data at this level is sensitive in and of itself, and its acquisition or disclosure can have severe legal repercussions and cause extensive harm. Examples include health information, medical history, social security number, driver’s license, and credit card information.
- Protection Level 0: The component has no protection at all.
- Protection Level 1: Bare-bones authentication: the application checks whether the user trying to access the resource is logged in to the application.
- Protection Level 2—RBAC [36]: In addition to being logged in, the user has different roles that define their scope of privileges within the application.
- Protection Level 3—ABAC [37]: In addition to the login and possible role scopes, other attributes such as time, physical location, or type/ID of device used are checked before granting access.
- In our experiments, we looked for cases in which the AI-based analysis of the source code produced pairings where these two values did not match. In line with the definition of CWE-653, such a mismatch means that, because of a lack of proper isolation, a service dealing with sensitive data can be accessed and called by a component whose protection level is lower than the sensitivity level of the data.
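The comparison described above can be sketched in a few lines. This is only an illustration under stated assumptions: the class and function names (`Sensitivity`, `Protection`, `Access`, `find_mismatches`) are hypothetical and do not come from the paper's scripts, and the rule "protection level lower than sensitivity level" is taken directly from the paragraph above.

```python
from dataclasses import dataclass
from enum import IntEnum


class Sensitivity(IntEnum):
    """Sensitivity scale from Section 3.1."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


class Protection(IntEnum):
    """Protection scale from Section 3.1."""
    NONE = 0           # no protection at all
    AUTHENTICATED = 1  # login check only
    RBAC = 2           # login plus role check
    ABAC = 3           # login, roles, and further attributes


@dataclass(frozen=True)
class Access:
    """Hypothetical record: a component calling a service that handles data."""
    component: str
    protection: Protection
    service: str
    sensitivity: Sensitivity


def is_isolation_mismatch(access: Access) -> bool:
    # Flag the pairing when the component is protected less strongly
    # than the data handled by the service it calls.
    return access.protection < access.sensitivity


def find_mismatches(accesses: list[Access]) -> list[Access]:
    return [a for a in accesses if is_isolation_mismatch(a)]
```

For example, a component that only checks login (level 1) and calls a service handling health data (level 3) would be flagged, while an ABAC-protected component calling the same service would not.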
3.2. Evaluation Dataset
3.3. GPT API and Evaluation
3.4. GPT-Driven Detection Pipeline
- Minifying Code Base. Input: The raw project folder and files; Output: A single .txt file per project, containing the relevant source files without line breaks and whitespace characters; Prompt Appendix: -; Supplementary Material: S1; Description: A transformation step to prepare the project files to be included in the prompts.
- Sensitive Element Detection. Input: The minified .txt file of the project; Output: A CSV file with the sensitive elements from the projects and their sensitivity levels; Prompt Appendix: Appendix A.1; Supplementary Material: S2; Description: The analysis first identifies the sensitive data and its sensitivity level and explains how the data appears and is used in the application; an aggregation script then uses this information to select the services that were involved in read or write operations on sensitive data in the project.
- JSON Mapping of Project Files. Input: The minified .txt file of the project; Output: An array of JSON objects following the template of Appendix A.2; Prompt Appendix: Appendix A.3; Supplementary Material: S3 contains the script, S3_1 contains the prompts; Description: A JSON file is created for each project, in which the components of the application are reduced to JSON objects, containing only the most necessary information.
- Protection Level Discovery. Input: The array of JSON objects mapped from the project; Output: The extended JSON list, containing the protection levels; Prompt Appendix: Appendix A.4; Supplementary Material: S3 contains the script, S3_1 contains the prompts; Description: The JSONs and the relevant routing and guard configurations are passed one by one to the GPT API, which adds the protection level corresponding to our scale.
- Vulnerability Detection. Input: The CSV file from Step 2 and the extended JSON file from Step 4; Output: An Excel file containing the sensitivity levels, protection levels, and detected vulnerabilities for each project; Prompt Appendix: -; Supplementary Material: S4_1–S_2 contain the scripts for the various Excel file generations, S4_4 is the vulnerability detector, which requires the results of the previous S4_X scripts; Description: By aggregating the results, the vulnerabilities in the tested projects are detected.
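As an illustration of the first pipeline step, the following is a minimal sketch of a minifier that produces one whitespace-free text file per project. It is not the script from Supplementary Material S1: the function names, the choice of file suffixes, and the decision to strip every whitespace character (which matches the minified samples shown in Appendix A.1) are assumptions made for this example.

```python
import re
from pathlib import Path

# Assumption: only TypeScript sources and HTML templates are relevant.
SOURCE_SUFFIXES = {".ts", ".html"}


def minify_source(text: str) -> str:
    """Remove all whitespace, including line breaks.

    Note that this also removes spaces inside string literals, so it is
    only suitable for preparing prompt input, not for producing code
    that will be executed.
    """
    return re.sub(r"\s+", "", text)


def minify_project(project_dir: Path, output_file: Path) -> None:
    """Concatenate the minified source files of a project into one file."""
    parts = []
    for path in sorted(project_dir.rglob("*")):
        if path.is_file() and path.suffix in SOURCE_SUFFIXES:
            parts.append(minify_source(path.read_text(encoding="utf-8")))
    output_file.write_text("".join(parts), encoding="utf-8")
```

Stripping whitespace reduces the number of tokens sent to the API, at the cost of readability for a human reviewer of the prompt.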
3.4.1. JSON Mapping of Project Files and Protection Level Discovery
3.4.2. Sensitive Element Detection
3.4.3. Vulnerability Detection
4. Results and Discussion
- RQ1: How effectively can we identify sensitive data and services based on context?
- RQ2: How effectively can we detect component protection levels?
- RQ3: How accurately can we identify vulnerabilities belonging to the CWE-653 category?
4.1. RQ1
- Injection Circumvention: Although Angular formally documents the exact role and intended use of services, a recurring problem in debugging has been that developers have circumvented the development pattern of injecting and then invoking methods. There were cases where services saved sensitive data in localStorage and the components of the application later read it from there, and one case where services used EventEmitters to pass sensitive values. This resulted in missing information cases in which the presence of services was detected but not their precise identities because they were not used in a conventional manner. Detecting and handling such cases is not easy. The prompt of the step to detect sensitive data and services will be extended in the later evaluations with an additional element to tag sensitive data from the EventEmitter, but because of the similarities (subscribe operation), they can easily be confused with observables, and thus, when detecting such cases, it is necessary to analyze the service code to see if the transmitted event is from inside or outside the application and if sensitive data are transmitted as a payload.
- Monolith Service: The most common issue was the lack of isolation mentioned above, a textbook example of the CWE-653 vulnerability, where services are not organized around a resource but around a type of operation. For example, there is an entity called DbService, which is responsible for the database operations of registration, login, shopping, commenting, and private messaging at the same time. This flaw led to very high insufficient detection counts, because the service was identified as sensitive or not depending on which of its many resources it was managing at that point. In the future, this can be resolved by assigning each service, at the end of the detection, the highest sensitivity value that was encountered during the detection.
- Delegated Responsibility: This issue refers to instances in which a service was initially used correctly, but then the data it handles was stored and transmitted in the application differently, such as by passing parameters between parent and child components or by creating a separate component or other object that acted as a session database. We expect that this will only be fully resolved if we can further develop our methodology as we plan and track accurate navigations and calls between application components. In the meantime, since the sensitivity detection prompt also tries to interpret what happens to sensitive data after it is first retrieved from the service, in the next round of analyses we will tag those that were derived from a sensitive service and, at the end of the detection, we will attempt to recover sensitivity levels based on these tags for data retrieved from local storages as well.
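The remedy proposed for the Monolith Service case, assigning each service the highest sensitivity value seen during detection, could be sketched as follows. The function name and the input shape (pairs of service name and detected level, with 0 standing for "not sensitive") are assumptions for illustration only.

```python
from collections import defaultdict


def highest_sensitivity_per_service(detections):
    """Collapse repeated detections into one level per service.

    `detections` is an iterable of (service_name, level) pairs, where
    level is 0 (assumed here to mean "not sensitive") or 1-3.
    """
    levels = defaultdict(int)
    for service, level in detections:
        # Keep the maximum level ever observed for this service.
        levels[service] = max(levels[service], level)
    return dict(levels)
```

With this aggregation, a `DbService` that was seen handling both comments and credit card data would be treated as level 3 everywhere it is injected.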
4.2. RQ2
4.3. RQ3
4.4. Future Work
5. Threats to Validity
6. Conclusions
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
Application programming interface | API |
Attribute-based access control | ABAC |
Automatic program repair | APR |
Common weakness enumeration | CWE |
Data access object | DAO |
Generative pre-trained transformer | GPT |
Large language model | LLM |
Reinforcement learning from human feedback | RLHF |
Role-based access control | RBAC |
Appendix A. Prompts Used in the Analysis Pipeline
Appendix A.1. The Prompt for Detecting Sensitive Elements in the Component Code
- ing“‘{text}’”
- In your answer, use this CSV format:
- Element; Type; Level; Origin; ServiceName; Function; Goal; GoalName; Classname
- Where the Element is the name of the sensitive element, the Type is its type based on the list above i.e., username, password, email address, etc., the reason why this element is identified as sensitive. The Level is a numeric value from 1–3, signifying how sensitive this information is. A value of 1 means low severity, such as browser history, user behavior, cookies, or the identifiers of such information; 2 means medium severity such as username, email, password, birthdate, IP address, location data, political/religious views, sexual orientation, etc.; 3 means high severity, information that might lead to identity theft, financial loss, or significant privacy invasion such as social security number, health insurance, drivers license, legal documents, passport, medical information, credit card data, encryption keys, tax IDs. The Origin is one of the following values: (‘UserInput’, ‘LocalStorage’, ‘Service’) The ServiceName is the name of the service from where the element came if the value of the Origin is ‘service’, else its an empty string. The Function is a description of what the given element is used for. The Goal is where the element is forwarded, stored, what happens to it, it must be one of the following values: (‘read’, ‘stored’, ‘service’). The GoalName is the name of the service the element is forwarded to if the value of the Goal field was ‘service’. The Classname is the name of the Angular class where the element was detected. For example:
- “‘
- customersArrayData; CustomerInformation;2; Service; DbServiceService;Array of Customer objects containing sensitive information about the customers; Read; ; CustomersComponent
- id; User Identifier; 1; UserInput; Id to query a Customer object from the database;Used to call the GetCustomer function of the DbServiceService; Service, DbServiceService; CustomersComponent
- ’”
- Your answer must include only the CSV format and nothing else! If there are no sensitive elements in the file, your answer must be exactly ’There are no sensitive elements in the file!’ The element and variable names in the files might not be English! Even if parts of the source code is in Spanish, German, Italian, Hungarian, or other languages, etc., you must correctly identify the sensitive elements!
- import{Component}from’@angular/core’;@Component({selector:’app-root’, templateUrl:
- ’./app.component.html’,styleUrls: [’./app.component.css’]})exportclassAppComponent{}
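A reply in the format requested by this prompt can be turned into records with a small parser. The sketch below is an assumption about how such a reply might be handled, not the authors' aggregation script; the names `FIELDS`, `NO_ELEMENTS`, and `parse_sensitive_elements` are hypothetical. Rows that do not have exactly nine semicolon-separated fields are rejected rather than guessed at.

```python
FIELDS = ["Element", "Type", "Level", "Origin", "ServiceName",
          "Function", "Goal", "GoalName", "Classname"]
NO_ELEMENTS = "There are no sensitive elements in the file!"


def parse_sensitive_elements(response: str) -> list[dict]:
    """Parse the semicolon-separated reply into a list of dicts."""
    text = response.strip()
    if NO_ELEMENTS in text:
        return []
    rows = []
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.split(";")]
        if not any(cells):
            continue  # skip blank lines
        if cells == FIELDS:
            continue  # skip a header line if the model echoes it
        if len(cells) != len(FIELDS):
            raise ValueError(f"expected {len(FIELDS)} fields, got {len(cells)}: {line!r}")
        row = dict(zip(FIELDS, cells))
        row["Level"] = int(row["Level"])  # sensitivity level 1-3
        rows.append(row)
    return rows
```

Raising on malformed rows is a design choice for this sketch; a production script might instead log and skip them, since model output is not guaranteed to follow the requested format.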
Appendix A.2. The Definition of the Mapping JSON, to Be Used by Multiple Prompts
- “selector”: if the type of the analyzed file is a component, this field should contain its selector string from its class decorator. If the file is not a component, the value should be an empty string.
- “name”: name of the class in the analyzed file, for example AppComponent, LoginService, etc.
- “file”: the name of the analyzed file. If you received multiple files for a component—the .ts and the .html—this should be the name of the .ts file.
- “path”: the path of the component—if there is any—from the navigation modules. In this step, the value of this should remain an empty list.
- “guarded”: if the class is a component, and it is protected by an AuthGuard, the value should be 1, the default value is 0.
- “guardlevel”: the value of this should be 0.
- “guards”: the names of the guard classes that protect the class in the analyzed file. The default value is an empty list []. If the analyzed class is not an Angular component, it must remain an empty list.
- “navigations”: A list of possible navigations from the file by class names. If you cannot find out the class name of the component targeted by the navigation, you should include its path. An optimal value of this field would be: [‘LoginComponent’, ‘DashboardComponent’]; an acceptable value would be [‘/login’, ‘/dashboard’]. If there are no possible navigations from the analyzed class, the list should be empty.
- You also need to take into account navigations that can possibly happen via a service, such as this.superService.navigate(‘home’) or this.tereloService.menj(‘/profile’). If the analyzed file was an Angular module, the value should be an empty list ([])!
- “injections”: A dict, where the keys are the names of the services and other injectables that were injected into the analyzed class via the constructor. The values of the keys are a dict, containing and organizing the calls when said injectable was used in the file. The keys of this inner dict are asynchrons, functions, and variables. Asynchron refers to promises and observables used by the analyzed file, functions to the names of functions called by the analyzed file, variables to the various variables of the injected accessed directly by the analyzed file. The values of this inner dict are lists which are either empty, if that type of interaction with the injectable did not happen, or the names of said interaction. An example value for this field would be: {‘LoginService’: {‘Functions’: [‘login’, ‘forgotPassword’], ‘Variables’: [], ‘Asynchrons’: [‘currentUser’]}, ‘LoggerService’: {‘Functions’: [‘logInfo’, ‘logError’], ‘Variables’: [‘log’], ‘Asynchrons’: []}} If there are no injectables in the file, the value should be an empty dict!
- “parents”: If the analyzed file is a component and it has parent components, this should be a list that contains the names of those components, like [‘ProfileComponent’] or [‘LoginComponent’, ‘RegisterComponent’]. The default value is an empty list.
- “children”: If the type of the analyzed file is a component, this list should contain the names—or if the names cannot be determined from the context, then the selector strings of every child component that are contained by the .html template of the component. For example: [‘SpinnerComponent’, ‘alert-component’]. If the type of the file is not component or it does not have any children, then the value of this field should be an empty list. The router outlet should also be considered to be a child, so if the .html template contains it, it should be included here!
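Because the mapping objects are produced by a language model, it can help to check each one against the field list above before it is passed on. The following sketch covers only the fields defined in this appendix; the names `DEFAULTS` and `field_problems` are hypothetical, and the default values are taken from the field descriptions.

```python
from typing import Any

# Field names and default values as described in Appendix A.2.
DEFAULTS: dict[str, Any] = {
    "selector": "",
    "name": "",
    "file": "",
    "path": [],
    "guarded": 0,
    "guardlevel": 0,
    "guards": [],
    "navigations": [],
    "injections": {},
    "parents": [],
    "children": [],
}


def field_problems(obj: dict[str, Any]) -> list[str]:
    """Report fields that are missing or have an unexpected type."""
    problems = []
    for key, default in DEFAULTS.items():
        if key not in obj:
            problems.append(f"missing: {key}")
        elif not isinstance(obj[key], type(default)):
            problems.append(f"wrong type: {key}")
    return problems
```

An empty result means the object has the expected shape; it does not say anything about whether the values themselves are correct.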
Appendix A.3. The Mapping Prompt That Converts the Minified TypeScript Files into JSONs
- {firstResultFormat}
- A short example of a correct output:
- {firstResultFormatExample}
- The file to be analyzed is the following:
- “‘
- {section}
- ’”
- Your answer should only contain the JSON and nothing else, as your answer will be instantly parsed as a JSON by the receiver! Use double quotation marks (") and no single quotation mark (') under any circumstance!
- { “type”: “Component”,
- “selector”: “profile-comp”,
- “name”: “ProfileComponent”,
- “file”: “profile.component.ts”,
- “path”: [“/profile”],
- “guarded”: 0,
- “guards”: [],
- “guardlevel”: 0,
- “navigations”: [‘/dashboard’],
- “injections”: {‘OAuthService’: {‘Functions’: [‘checkAuth’], ‘Variables’: [],
- ’Asynchrons’: [‘token’]}, ‘LoggerService’: {‘Function’: [‘logInfo’, ‘logError’],
- ’Variable’: [‘log’], ‘Asynchron’: []}},
- “parents”: [],
- “children”: [] }
- during the evaluations.
- The value of section is a minified .ts source file such as:
- import{Component,OnInit}from’@angular/core’;@Component({selector:’app-
- about’,templateUrl:’./about.component.html’,styleUrls:[‘./about.component.scss’]})
- exportclassAboutComponentimplementsOnInit{constructor(){}ngOnInit(){}}
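The placeholders in this prompt ({firstResultFormat}, {firstResultFormatExample}, {section}) suggest a simple template substitution followed by JSON parsing of the reply. The sketch below illustrates that flow under stated assumptions: the template wording is abbreviated, the function names are hypothetical, no API call is made, and the handling of a reply wrapped in code fences is an added safeguard that the source does not describe.

```python
import json

FENCE = "`" * 3  # three backticks, built indirectly to keep this block readable

# Abbreviated template; the placeholder names follow Appendix A.3.
MAPPING_TEMPLATE = (
    "{firstResultFormat}\n"
    "A short example of a correct output:\n"
    "{firstResultFormatExample}\n"
    "The file to be analyzed is the following:\n"
    "{fence}\n{section}\n{fence}\n"
    "Your answer should only contain the JSON and nothing else."
)


def build_mapping_prompt(result_format: str, example: str, section: str) -> str:
    """Fill the template with the JSON definition, an example, and one file."""
    return MAPPING_TEMPLATE.format(
        firstResultFormat=result_format,
        firstResultFormatExample=example,
        section=section,
        fence=FENCE,
    )


def parse_mapping_reply(reply: str) -> dict:
    """Parse the model reply as a single JSON object."""
    text = reply.strip()
    if text.startswith(FENCE):
        # Assumption: tolerate a reply wrapped in a fenced block.
        text = text.strip("`").strip()
        if text.startswith("json"):
            text = text[4:]
    obj = json.loads(text)
    if not isinstance(obj, dict):
        raise ValueError("expected a JSON object")
    return obj
```

Passing the minified source as a substitution value, rather than embedding it in the template string, means that braces in the source code do not interfere with the formatting step.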
Appendix A.4. The Access Control Evaluation Prompt
- {firstResultFormat} With this JSON, you will also receive minified versions of the Angular files, that will contain information concerning the routing and the AuthGuard configurations of the web application. Based on these files you will need to extend the JSON representing the component with the following steps.
- 1. Modify the path value if their ‘path’ is an empty list. If there is no path defined in any of the route[] arrays in the received files that contain a path value for that particular component, then the value of its path should remain an empty list in the JSON. There is a good chance for example, that components that are children to another component should not have any paths, because they are most likely to be accessible through their parent component. For example: If there is {{ path: ‘register’, component: RegisterComponent }} in one of the routing modules Then, in the JSON, where type: ‘component’ and name: ‘RegisterComponent’, ‘path’ should be [‘/register’] Make sure that even if there is already a value in the path field, it has a unified format, such as ‘/register’, ‘/home’, ‘/profile/:id’ The reason why the path is a list and not a string is because you also need to include the default/home path to the appropriate JSON (‘/’) and the wildcard path (‘**’) to the appropriate JSON if they exist in the route definitions. The paths might be also stored in multiple files if there is a module–submodule relationship. For example, if one routing module contains the following:{{path: ‘users/:id’,loadChildren: () => import(“./users-module/users.module”).then(m => m.UsersModule)}}, then the UsersModule imports another routing module, for example, a UsersRoutingModule, where there is a line such as:{{path: ‘addition’, component: AddUserComponent}},then the path for the AddUserComponent is ‘/users/:id/adddition’ if there is{{path: “, redirectTo: ‘profile’}}in that module, then the component with the ‘/users/:id/profile’ path should also get the ‘/users/:id/’ route in their paths array. Do not under any circumstance mix the paths and the possible navigations from that given component! - only components have paths! - most components only have one path. 
If the component is used as the child component of another, it is very likely that it does not have a path in the routing information, therefore, its path list should remain empty. - the only exceptions are usually the default routes: ”, ‘/’ and the wildcard route: ‘**’
- 2. List every ‘guard’ by their class names in the ‘guards’ list of the JSON that are protecting the component based on the canActivate parts of the routing information. For example:if there is a configuration like this in one of the module files {{path: ‘profile’, component: ProfileComponent, canActivate: [AuthguardGuard]}} then in the JSON, where ‘type’: ‘component’ and ‘name’: ‘ProfileComponent’, ‘guarded’ should be set to 1, and the ‘guards’ should be: [‘AuthguardGuard’] Again, take into account the module level imports: if there is a {{path: ‘users/:id’, canActivate: [RoleGuard], loadChildren: () => import(“./users-module/users.module”).then(m => m.UsersModule)}} then every route/component that is imported by the UsersModule already has the RoleGuard, in addition to any other they might have individually! For example, a {{path: ‘addition’, component: AddUserComponent}} in the UsersModule’s imported routes would give the AddUserComponent’s JSON the [‘RoleGuard’] as its guard, while a {{path: ‘deletion’, component: DeletionUserComponent, canActivate: [AdminGuard]}} would give the DeletionUserComponent’s JSON the [‘RoleGuard’, ‘AdminGuard’] list as its ‘guards’.
- 3. As a final step, you need to calculate the guardlevel if the JSON has ‘component’ as its type, based on the following rules: - if the ‘guarded’ value is 0 and their ‘guards’ list is still empty after the previous steps, the ‘guardlevel’ is 0 - if it has at least one guard in their ‘guards’ list, but based on the implementation of it, that guard only checks whether the user is logged in or not, the ‘guardlevel’ should be 1. - if there is implementation of guard or guards in the ‘guards’ list, they not only check whether they are logged in, but they also check for access level (admin or normal user; patient, relative, nurse or practitioner; buyer or seller), the ‘guardlevel’ should be 2. - if the implementation of the guards contains everything necessary for guardlevel 2, but go beyond them and check for extra information, for example, location, IP address, timezone, device/browser type, the guardlevel should be 3. Your answer should contain the full version of the modified JSON, it will have to include every object which was in the input in the same format and contain every modification you have made to it.
- This is the input JSON file, separated by three backticks:
- “‘
- {firstResult}
- ’”
- And these are the minified files containing the routing and guard information, also separated by three backticks:
- “‘
- {filteredGuards}
- ’”
- The value of filteredGuards is the minified, concatenated code of the AuthGuards and routing modules from a project such as:
- import{NgModule}from‘@angular/core’;import{PreloadAllModules,RouterModule,
- Routes}from‘@angular/router’;import{AuthGuard}from‘./auth-module/auth.guard’
- ;import{MainMenuComponent}from‘./components/main-menu/main-menu.component’;
- import{NotFoundComponent}from‘./components/not-found/not-found.component’;
- import{ReclutamientoComponent}from‘./components/reclutamiento/reclutamiento
- .component’;import{RoleGuard}from‘./guards/role.guard.ts’;const routes:Routes=
- {path:”,redirectTo:‘inicio’,pathMatch:‘full’},{path:‘inicio’,component:MainMenuCompo
- nent},{path:‘reclutamiento’,component:ReclutamientoComponent,canActivate:roleGuard}
- ,{path:‘404’,component:NotFoundComponent};@NgModule({imports:RouterModule.for
- Root(routes,{preloadingStrategy:PreloadAllModules,scrollPositionRestoration:‘enabled’})
- exports:RouterModule})export class AppRoutingModule{}import{Injectable}from‘@angular
- /core’;import{CanActivate,ActivatedRouteSnapshot,RouterStateSnapshot,UrlTree,Router}
- from‘@angular/router’;import{Observable}from‘rxjs’;@Injectable({providedIn:‘root’})export class PlanningGuard implements CanActivate{constructor(private router:Router){}canActi
- vate(next:ActivatedRouteSnapshot,state:RouterStateSnapshot):Observable<boolean
- UrlTree> Promise<boolean UrlTree>,boolean UrlTree{if(this.router.getCurrentNavigation()
- .extras.state)return true;this.router.navigate(‘planning’);return false;}}
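The rules this prompt asks the model to follow (path concatenation across nested route definitions, guard inheritance from parent routes, and the guardlevel scale) can also be written down deterministically, which may be useful for checking model output. The sketch below is an illustration only. It assumes that the routes have already been extracted into plain dictionaries, that lazily loaded modules are represented as a `children` list, and that what each guard checks is supplied by hand in `checks_by_guard`; all function and key names are hypothetical.

```python
def join_path(prefix: str, segment: str) -> str:
    """Join route segments into a unified format such as '/users/:id/addition'."""
    parts = [p for p in (prefix.strip("/"), segment.strip("/")) if p]
    return "/" + "/".join(parts)


def flatten_routes(routes, prefix="", inherited=()):
    """Return (component, full_path, guards) for every routed component.

    Guards listed on a parent route apply to all of its children, in
    addition to the guards the children define themselves.
    """
    flat = []
    for route in routes:
        path = join_path(prefix, route.get("path", ""))
        guards = list(inherited)
        guards += [g for g in route.get("canActivate", []) if g not in guards]
        if "component" in route:
            flat.append((route["component"], path, guards))
        flat.extend(flatten_routes(route.get("children", []), path, tuple(guards)))
    return flat


def guard_level(guards, checks_by_guard) -> int:
    """Map a guard list to the 0-3 scale of step 3.

    `checks_by_guard` maps a guard name to a set drawn from
    {"login", "role", "attribute"}. Unknown guards are assumed to check
    login only. Level 3 requires the role check of level 2 as well.
    """
    if not guards:
        return 0
    checks = set().union(*(checks_by_guard.get(g, {"login"}) for g in guards))
    if {"role", "attribute"} <= checks:
        return 3
    if "role" in checks:
        return 2
    return 1
```

Applied to the example from step 2, a parent route `users/:id` guarded by `RoleGuard` with children `addition` and `deletion` (the latter also guarded by `AdminGuard`) yields the guard lists `['RoleGuard']` and `['RoleGuard', 'AdminGuard']`, as the prompt describes.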
References
- Introduction to the Angular Docs. Available online: https://angular.io/docs (accessed on 20 July 2023).
- Sanderson, K. GPT-4 is here: What scientists think. Nature 2023, 615, 773.
- Deng, J.; Lin, Y. The Benefits and Challenges of ChatGPT: An Overview. Front. Comput. Intell. Syst. 2023, 2, 81–83.
- Jánki, Z.R.; Bilicki, V. Rule-Based Architectural Design Pattern Recognition with GPT Models. Electronics 2023, 12, 3364.
- Hourani, H.; Hammad, A.; Lafi, M. The Impact of Artificial Intelligence on Software Testing. In Proceedings of the 2019 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology (JEEIT), Amman, Jordan, 9–11 April 2019.
- Heydon, A.; Maimone, M.; Tygar, J.; Wing, J.; Zaremski, A. Miro: Visual specification of security. IEEE Trans. Softw. Eng. 1990, 16, 1185–1197.
- Giordano, M.; Polese, G. Visual Computer-Managed Security: A Framework for Developing Access Control in Enterprise Applications. IEEE Softw. 2013, 30, 62–69.
- Hossain Misu, M.R.; Sakib, K. FANTASIA: A Tool for Automatically Identifying Inconsistency in AngularJS MVC Applications. In Proceedings of the Twelfth International Conference on Software Engineering Advances, Athens, Greece, 8–12 October 2017.
- Szabó, Z.; Bilicki, V. Access Control of EHR Records in a Heterogeneous Cloud Infrastructure. Acta Cybern. 2021, 25, 485–516.
- Martin, B.; Brown, M.; Paller, A.; Kirby, D.; Christey, S. CWE. SANS Top 2011, 25.
- Rainey, S.; McGillivray, K.; Akintoye, S.; Fothergill, T.; Bublitz, C.; Stahl, B. Is the European Data Protection Regulation sufficient to deal with emerging data concerns relating to neurotechnology? J. Law Biosci. 2020, 7, lsaa051.
- Cheng, S.; Zhang, J.; Dong, Y. How to Understand Data Sensitivity? A Systematic Review by Comparing Four Domains. In Proceedings of the 2022 4th International Conference on Big Data Engineering, Beijing, China, 26–28 May 2022.
- Belen Saglam, R.; Nurse, J.R.; Hodges, D. Personal information: Perceptions, types and evolution. J. Inf. Secur. Appl. 2022, 66, 103163.
- Lang, C.; Woo, C.; Sinclair, J. Quantifying data sensitivity. In Proceedings of the Tenth International Conference on Learning Analytics & Knowledge, Frankfurt, Germany, 23–27 March 2020.
- Chua, H.N.; Ooi, J.S.; Herbland, A. The effects of different personal data categories on information privacy concern and disclosure. Comput. Secur. 2021, 110, 102453.
- Rumbold, J.M.; Pierscionek, B.K. What Are Data? A Categorization of the Data Sensitivity Spectrum. Big Data Res. 2018, 12, 49–59.
- Botti-Cebriá, V.; del Val, E.; García-Fornes, A. Automatic Detection of Sensitive Information in Educative Social Networks. In Proceedings of the 13th International Conference on Computational Intelligence in Security for Information Systems (CISIS 2020), Burgos, Spain, 14 May 2020; pp. 184–194.
- Jiang, L.; Liu, H.; Jiang, H. Machine Learning Based Recommendation of Method Names: How Far are We. In Proceedings of the 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE), San Diego, CA, USA, 11–15 November 2019.
- Momeni, P.; Wang, Y.; Samavi, R. Machine Learning Model for Smart Contracts Security Analysis. In Proceedings of the 2019 17th International Conference on Privacy, Security and Trust (PST), Fredericton, NB, Canada, 26–28 August 2019.
- Mhawish, M.Y.; Gupta, M. Predicting Code Smells and Analysis of Predictions: Using Machine Learning Techniques and Software Metrics. J. Comput. Sci. Technol. 2020, 35, 1428–1445.
- Cui, J.; Wang, L.; Zhao, X.; Zhang, H. Towards predictive analysis of android vulnerability using statistical codes and machine learning for IoT applications. Comput. Commun. 2020, 155, 125–131.
- Park, S.; Choi, J.Y. Malware Detection in Self-Driving Vehicles Using Machine Learning Algorithms. J. Adv. Transp. 2020, 2020, 3035741.
- Jiang, N.; Lutellier, T.; Tan, L. CURE: Code-Aware Neural Machine Translation for Automatic Program Repair. In Proceedings of the 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE), Madrid, Spain, 22–30 May 2021; IEEE: Piscataway, NJ, USA, 2021.
- Sharma, T.; Kechagia, M.; Georgiou, S.; Tiwari, R.; Vats, I.; Moazen, H.; Sarro, F. A Survey on Machine Learning Techniques for Source Code Analysis. arXiv 2022, arXiv:2110.09610.
- Sarkar, A.; Gordon, A.D.; Negreanu, C.; Poelitz, C.; Ragavan, S.S.; Zorn, B. What is it like to program with artificial intelligence? arXiv 2022, arXiv:2208.06213.
- Wei, J.; Tay, Y.; Bommasani, R.; Raffel, C.; Zoph, B.; Borgeaud, S.; Yogatama, D.; Bosma, M.; Zhou, D.; Metzler, D.; et al. Emergent Abilities of Large Language Models. arXiv 2022, arXiv:2206.07682.
- Liu, Y.; Han, T.; Ma, S.; Zhang, J.; Yang, Y.; Tian, J.; He, H.; Li, A.; He, M.; Liu, Z.; et al. Summary of ChatGPT/GPT-4 Research and Perspective Towards the Future of Large Language Models. arXiv 2023, arXiv:2304.01852.
- Surameery, N.M.S.; Shakor, M.Y. Use Chat GPT to Solve Programming Bugs. Int. J. Inf. Technol. Comput. Eng. 2023, 3, 17–22.
- Borji, A.; Mohammadian, M. Battle of the Wordsmiths: Comparing ChatGPT, GPT-4, Claude, and Bard. SSRN Electron. J. 2023.
- Wu, J. Literature review on vulnerability detection using NLP technology. arXiv 2021, arXiv:2104.11230.
- Thapa, C.; Jang, S.I.; Ahmed, M.E.; Camtepe, S.; Pieprzyk, J.; Nepal, S. Transformer-based language models for software vulnerability detection. In Proceedings of the 38th Annual Computer Security Applications Conference, Austin, TX, USA, 5–9 December 2022; pp. 481–496.
- Omar, M. Detecting software vulnerabilities using Language Models. arXiv 2023, arXiv:2302.11773.
- Sun, Y.; Wu, D.; Xue, Y.; Liu, H.; Wang, H.; Xu, Z.; Xie, X.; Liu, Y. When GPT Meets Program Analysis: Towards Intelligent Detection of Smart Contract Logic Vulnerabilities in GPTScan. arXiv 2023, arXiv:2308.03314.
- Cheshkov, A.; Zadorozhny, P.; Levichev, R. Evaluation of ChatGPT Model for Vulnerability Detection. arXiv 2023, arXiv:2304.07232.
- Feng, S.; Chen, C. Prompting Is All You Need: Automated Android Bug Replay with Large Language Models. arXiv 2023, arXiv:2306.01987.
- Ferraiolo, D.; Cugini, J.; Kuhn, D.R. Role-based access control (RBAC): Features and motivations. In Proceedings of the 11th Annual Computer Security Application Conference, New Orleans, LA, USA, 11–15 December 1995; pp. 241–248.
- Yuan, E.; Tong, J. Attributed based access control (ABAC) for Web services. In Proceedings of the IEEE International Conference on Web Services (ICWS’05), Orlando, FL, USA, 11–15 July 2005.
- Pricing of GPT. Available online: https://openai.com/pricing (accessed on 20 July 2023).
- OpenAI—Privacy Policy. 2023. Available online: https://openai.com/policies/privacy-policy (accessed on 25 September 2023).
- Qiu, R. Editorial: GPT revolutionizing AI applications: Empowering future digital transformation. Digit. Transform. Soc. 2023, 2, 101–103. [Google Scholar] [CrossRef]
- Shoeybi, M.; Patwary, M.; Puri, R.; LeGresley, P.; Casper, J.; Catanzaro, B. Megatron-lm: Training multi-billion parameter language models using model parallelism. arXiv 2019, arXiv:1909.08053. [Google Scholar]
- Ji, Z.; Lee, N.; Frieske, R.; Yu, T.; Su, D.; Xu, Y.; Ishii, E.; Bang, Y.J.; Madotto, A.; Fung, P. Survey of Hallucination in Natural Language Generation. ACM Comput. Surv. 2023, 55, 248. [Google Scholar] [CrossRef]
- Moghaddam, S.R.; Honey, C.J. Boosting Theory-of-Mind Performance in Large Language Models via Prompting. arXiv 2023, arXiv:2304.11490. [Google Scholar]
- Brown, T.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.D.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 2020, 33, 1877–1901. [Google Scholar] [CrossRef]
- Wang, X.; Wei, J.; Schuurmans, D.; Le, Q.; Chi, E.; Narang, S.; Chowdhery, A.; Zhou, D. Self-Consistency Improves Chain of Thought Reasoning in Language Models. arXiv 2023, arXiv:2203.11171. [Google Scholar]
- What Is the Difference between the GPT-4 Models? Available online: https://help.openai.com/en/articles/7127966-what-is-the-difference-between-the-gpt-4-models (accessed on 20 July 2023).
- Martin, R.C. Getting a SOLID Start. Robert C Martin-objectmentor.com. 2013. Available online: https://sites.google.com/site/unclebobconsultingllc/getting-a-solid-start (accessed on 26 September 2023).
- Kokrehel, G.; Bilicki, V. The impact of the software architecture on the developer productivity. Pollack Period. 2022, 17, 7–11. [Google Scholar] [CrossRef]

Author | Models | Subject | Weaknesses |
---|---|---|---|
Wu [30] | BERT and GPT | Software vulnerability detection | Relies on segmentation of source code and feature extraction |
Thapa et al. [31] | GPT-2 | Software vulnerabilities in C/C++ source code | GPU memory constraints, managing model parallelism and dependencies |
Omar [32] | GPT-2 | Vulnerabilities in C/C++ codebases | High computational burden of traditional deep learning models like CNNs and LSTMs |
Sun et al. [33] | GPT-type models | Smart contract source code | Only “acceptable” precision for larger projects like Web3Bugs |
Cheshkov et al. [34] | GPT-3.5 and ChatGPT | CWE vulnerabilities in Java applications | Naive assumptions about GPT’s knowledge, results on par with dummy classifier |
Feng and Chen [35] | ChatGPT | Android bug replay | Limited by manually crafted patterns and pre-defined vocabulary lists |

Prompt Type | Token Count |
---|---|
Sensitivity detection prompt (Appendix A.1) | 905 tokens |
Feature extraction prompt (Appendix A.2 and Appendix A.3) | 1564 tokens |
AuthGuard and protection level evaluation prompt (Appendix A.4) | 2354 tokens |
Average token count of the largest files from the largest projects | 3560 tokens |
Average token count of the largest mapped JSON | 1145 tokens |
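
The token counts above can be turned into a rough budget check. The following is a minimal sketch, not the authors' tooling: the pairing of each prompt with its input (source file or mapped JSON) and the `CONTEXT_LIMIT` value of 8192 tokens are assumptions made for illustration, and `remaining_tokens` is a hypothetical helper.

```python
# Token counts taken from the table above.
PROMPT_TOKENS = {
    "sensitivity_detection": 905,
    "feature_extraction": 1564,
    "protection_level": 2354,
}
AVG_LARGEST_FILE_TOKENS = 3560
AVG_LARGEST_MAPPED_JSON_TOKENS = 1145

# Assumption: which input accompanies which prompt is not stated in the table.
REQUEST_INPUT_TOKENS = {
    "sensitivity_detection": AVG_LARGEST_FILE_TOKENS,
    "feature_extraction": AVG_LARGEST_FILE_TOKENS,
    "protection_level": AVG_LARGEST_MAPPED_JSON_TOKENS,
}

# Assumption: an illustrative context window size; adjust to the model in use.
CONTEXT_LIMIT = 8192


def remaining_tokens(prompt_tokens: int, input_tokens: int, context_limit: int) -> int:
    """Tokens left for the model's answer after the prompt and its input."""
    return context_limit - (prompt_tokens + input_tokens)


for name, input_tokens in REQUEST_INPUT_TOKENS.items():
    used = PROMPT_TOKENS[name] + input_tokens
    left = remaining_tokens(PROMPT_TOKENS[name], input_tokens, CONTEXT_LIMIT)
    print(f"{name}: {used} tokens used, {left} left for the answer")
```

A negative result would indicate that the file has to be split or the prompt shortened before the request is sent.
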

Name | Content |
---|---|
Type | The type of the analyzed file: component, service, etc. |
Selector | If the type is component, the selector string specified in the class decorator, which marks the locations in the HTML templates where the component is rendered. |
Name | The name of the analyzed class. |
File | The name of the analyzed file. |
Path | If the file is a component, the route string from the application’s routing configuration through which it is reached; an empty string for components that are only embedded as children. |
Guarded | Boolean value indicating whether one or more AuthGuards protect the class, if it is a component. |
Guardlevel | A numeric value in the 0–3 interval according to the security levels defined in the previous section. |
Guards | The names of the AuthGuard classes protecting this class if it is a component. |
Navigations | A list containing the names of the components to which we can navigate forward from this class. |
Injections | A list of the services injected into the class, together with the exact interactions with them, which may be asynchronous calls, attributes, or functions. |
Parents | The list of components whose HTML templates contain the selector of this class (if it is a component). |
Children | If the analyzed file is a component, a list of the components referenced by their selectors in its HTML template. |
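
The fields above describe one JSON object per analyzed file. As an illustration only, the structure could be modeled as below; the lower-case key names, the shape of the `injections` entries, the consistency check between `guarded` and `guards`, and every value in `example` are assumptions, not the authors' actual schema or data.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class FileMapping:
    """One mapped project file; fields follow the table above in lower case."""
    type: str
    name: str
    file: str
    selector: str = ""
    path: str = ""
    guarded: bool = False
    guardlevel: int = 0
    guards: List[str] = field(default_factory=list)
    navigations: List[str] = field(default_factory=list)
    # Assumption: each injection is a small dict; the real layout is not specified.
    injections: List[Dict[str, str]] = field(default_factory=list)
    parents: List[str] = field(default_factory=list)
    children: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not 0 <= self.guardlevel <= 3:
            raise ValueError("guardlevel must be in the 0-3 interval")
        # Assumption: guarded should be true exactly when guard names are listed.
        if self.guarded != bool(self.guards):
            raise ValueError("guarded and guards are inconsistent")


# Hypothetical component, for illustration only.
example = FileMapping(
    type="component",
    name="ProfileComponent",
    file="profile.component.ts",
    selector="app-profile",
    path="profile",
    guarded=True,
    guardlevel=2,
    guards=["AuthGuard"],
    injections=[{"service": "UserService", "interaction": "getUser", "kind": "function"}],
    children=["app-avatar"],
)
print(json.dumps(asdict(example), indent=2))
```

Validating the record on construction is one way to catch malformed model output before it is passed to a later analysis step.
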

Name | Content |
---|---|
Element | The name of the sensitive object or variable. |
Type | The type of the sensitive element, following a convention similar to the list in the prompt in Appendix A.1. |
Level | The sensitivity level of the detected element. |
Origin | One of userinput, service, or localstorage, the three most common sources of sensitive data in a web application. |
ServiceName | If the origin is service, the name of the service that is the source of the sensitive element. |
Function | The role of the sensitive element in the component where it was detected. |
Goal | The fate of the sensitive element after its usage: stored, read, or service. |
GoalName | If the value of the goal field is service, the name of the service to which the sensitive element is passed. |
Classname | The name of the class in which the sensitive element was detected. |
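
Because these records are produced by a language model, it can help to check each one against the constraints the table implies. The sketch below is an assumption-laden illustration: the lower-case key names, the choice of required fields, and the helper `validate_sensitive_element` are hypothetical, and the sample record is invented.

```python
from typing import Any, Dict, List

# Assumption: key names are the table's field names in lower case.
REQUIRED_FIELDS = ("element", "type", "level", "origin", "function", "goal", "classname")
ORIGINS = {"userinput", "service", "localstorage"}
GOALS = {"stored", "read", "service"}


def validate_sensitive_element(record: Dict[str, Any]) -> List[str]:
    """Return the problems found in one record; an empty list means none."""
    problems = [f"missing field: {key}" for key in REQUIRED_FIELDS if key not in record]
    origin = record.get("origin")
    goal = record.get("goal")
    if origin is not None and origin not in ORIGINS:
        problems.append(f"unknown origin: {origin}")
    if goal is not None and goal not in GOALS:
        problems.append(f"unknown goal: {goal}")
    # The service name fields are only meaningful when a service is involved.
    if origin == "service" and not record.get("servicename"):
        problems.append("origin is service but servicename is empty")
    if goal == "service" and not record.get("goalname"):
        problems.append("goal is service but goalname is empty")
    return problems


# Hypothetical record with one deliberate defect (empty goalname).
record = {
    "element": "email",
    "type": "contact data",
    "level": 2,
    "origin": "userinput",
    "function": "login form field",
    "goal": "service",
    "goalname": "",
    "classname": "LoginComponent",
}
print(validate_sensitive_element(record))
```

Records that fail such a check could be counted separately or sent back to the model, depending on how strict the pipeline needs to be.
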

Step | Average Runtime with GPT-4 | Average Runtime with GPT-3.5 |
---|---|---|
Sensitivity Element Detection | 216.846 s | 194.266 s |
JSON Mapping of Project Files | 217.334 s | 163.882 s |
Protection Level Discovery | 247.904 s | 140.559 s |

Metric | GPT-4 | GPT-3.5 |
---|---|---|
Number of sensitive data elements | 363 | 375 |
Incorrect detection | 7 | 128 |
Duplicate detection | 25 | 23 |
Insufficient detection | 0 | 29 |

Metric | GPT-4 | GPT-3.5 |
---|---|---|
Detected Sensitive Services | 37 | 28 |
Undetected Sensitive Services | 17 | 26 |
Incorrect Sensitivity Levels | 2 | 17 |
Insufficient Detection Count | 23 | 24 |
Missing Information | 3 | 5 |

Metric | GPT-4 | GPT-3.5 |
---|---|---|
Components with Incorrect Service Detection | 0 | 23 |
Components with Incorrect AuthGuard Detection | 14 | 96 |
Components with Incorrect Protection Level | 4 | 112 |

Metric | GPT-4 | GPT-3.5 |
---|---|---|
Detected Vulnerability | 40 | 17 |
False Vulnerability | 0 | 8 |
Undetected Vulnerability | 49 | 80 |
Undetected High-Sensitivity Vulnerability | 10 | 16 |
Undetected Medium-Sensitivity Vulnerability | 31 | 56 |
Undetected Low-Sensitivity Vulnerability | 8 | 8 |

Metric | GPT-4 Original | GPT-4 Strict |
---|---|---|
Detected Vulnerability | 40 | 79 |
False Vulnerability | 0 | 0 |
Undetected Vulnerability | 49 | 10 |
Undetected High-Sensitivity Vulnerability | 10 | 0 |
Undetected Medium-Sensitivity Vulnerability | 31 | 3 |
Undetected Low-Sensitivity Vulnerability | 8 | 7 |
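
The vulnerability counts in the two tables above can be summarized as precision and recall. This is a sketch under a stated assumption: "Detected", "False", and "Undetected" are read here as true positives, false positives, and false negatives, which the tables do not state explicitly. The `Outcome` class is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Outcome:
    detected: int    # read as true positives (assumption)
    false: int       # read as false positives (assumption)
    undetected: int  # read as false negatives (assumption)
    undetected_by_level: Tuple[int, int, int]  # high, medium, low sensitivity

    def precision(self) -> float:
        reported = self.detected + self.false
        return self.detected / reported if reported else 0.0

    def recall(self) -> float:
        actual = self.detected + self.undetected
        return self.detected / actual if actual else 0.0

    def is_consistent(self) -> bool:
        """The per-level undetected counts should add up to the total."""
        return sum(self.undetected_by_level) == self.undetected


# Counts copied from the tables above.
RESULTS = {
    "GPT-3.5": Outcome(17, 8, 80, (16, 56, 8)),
    "GPT-4 original": Outcome(40, 0, 49, (10, 31, 8)),
    "GPT-4 strict": Outcome(79, 0, 10, (0, 3, 7)),
}

for name, outcome in RESULTS.items():
    print(f"{name}: precision={outcome.precision():.3f}, "
          f"recall={outcome.recall():.3f}, consistent={outcome.is_consistent()}")
```

Under this reading, the stricter prompt raises GPT-4's recall from about 0.45 to about 0.89 while its precision stays at 1.0.
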

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Szabó, Z.; Bilicki, V. A New Approach to Web Application Security: Utilizing GPT Language Models for Source Code Inspection. Future Internet 2023, 15, 326. https://doi.org/10.3390/fi15100326