A New Algorithm for Detecting GPN Protein Expression and Overexpression of IDC and ILC Her2+ Subtypes on Polyacrylamide Gels Associated with Breast Cancer
Abstract
1. Introduction
1.1. Novelty
1.2. Limitations and Challenges
2. Materials and Methods
2.1. Creation of Samples with Different GPN Concentrations
2.2. Image Acquisition
2.3. Preprocessing and Feature Extraction
Algorithm 1. Pseudocode to find the number of lanes and bands in the polyacrylamide gel image. | |
Algorithm for band and lane detection | |
1: | Resize the image to 600 × 400 px for light processing |
2: | if Excess_of_protein: |
3: | Histogram equalization |
4: | end if |
5: | Obtain a binarized Image |
6: | Image dilation |
7: | Image invert |
8: | Image erosion |
9: | Column = 1 |
10: | if Lane_detection: |
11: | MAXWIDE = 400 px |
12: | else: # band detection |
13: | MAXWIDE = 50 px |
14: | end else |
15: | end if |
16: | Apply Binary Mask on the resized image |
17: | Initialize Array to zero |
18: | while Column <= MAXWIDE: |
19: | Get the Histogram_of_image |
20: | Otsu_Segmentation_Applied_to_Binary_Mask_Size_zone |
21: | Get the number of white pixels in the histogram of the segmented region, Histogram[white_position] # count of white (255) pixels in the binarized histogram |
22: | Array [Column] = Number_White_Pixels_Histogram [255] |
23: | Column++ |
24: | end while |
25: | Plot Array |
26: | if Lane_Analysis: |
27: | Multiple_Minimum_correlate_Lane_Separation(Array) |
28: | Multiple_Maximus_related_Band_Separation(Array) |
29: | else: # band analysis |
30: | Average_Multiples_Maximums_between_separations_To_Get_One_Maximum |
31: | end else |
32: | end if |
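The core of Algorithm 1 (binarize, count white pixels per column, then split the resulting profile at its minima) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the image is assumed to be already inverted so that protein appears bright, a single global Otsu threshold replaces the per-zone segmentation of steps 19 to 21, and dilation and erosion are omitted. The names `otsu_threshold`, `column_profile`, `split_lanes`, and the `gap_level` parameter are hypothetical.

```python
from typing import List, Tuple


def otsu_threshold(values: List[int]) -> int:
    """Return the gray level (0-255) that maximizes between-class variance."""
    hist = [0] * 256
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # no pixels in the dark class yet
        w1 = total - w0
        if w1 == 0:
            break  # every pixel is already in the dark class
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def column_profile(image: List[List[int]]) -> List[int]:
    """Count white (above-threshold) pixels in each column of the image."""
    threshold = otsu_threshold([p for row in image for p in row])
    width = len(image[0])
    return [sum(1 for row in image if row[c] > threshold) for c in range(width)]


def split_lanes(profile: List[int], gap_level: int = 0) -> List[Tuple[int, int]]:
    """Return (first_column, last_column) for each run above gap_level."""
    lanes, start = [], None
    for c, count in enumerate(profile):
        if count > gap_level and start is None:
            start = c
        elif count <= gap_level and start is not None:
            lanes.append((start, c - 1))
            start = None
    if start is not None:
        lanes.append((start, len(profile) - 1))
    return lanes


# Toy 4 x 9 image: two bright lanes separated by dark background columns.
dark_row = [10] * 9
band_row = [10, 200, 200, 200, 10, 200, 200, 200, 10]
image = [dark_row, band_row, band_row, dark_row]

profile = column_profile(image)
print(profile)
print(split_lanes(profile))
```

Running the same profile along rows inside a single lane (with the narrower MAXWIDE of step 13) would give the band profile, whose maxima correspond to band positions.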
3. Results and Discussion
3.1. Traditional Analysis of GPN Protein Gels at Different Concentrations
3.2. Preprocessing and Feature Extraction Using the Proposed Algorithm
3.3. Detection of Protein Overexpression in Gels Using the IPBBIS Algorithm
3.3.1. Application of the New Intensity Profile on the Complete Gel Image
3.3.2. Application of the IPBBIS Method on a Sample with Controlled Concentrations
3.3.3. Effectiveness of the IPBBIS Method Using Known Concentrations
3.3.4. Elimination of Impurities through the Determination of the Molecular Weight of the Target Protein
3.3.5. Choice of Threshold for the Elimination of Multiple Maximums
3.3.6. Analysis of the Region of Interest Using the IPBBIS Methods, Manual Area Calculation, Area Calculation by K-Means Segmentation, and Area Calculation by Otsu Segmentation
3.3.7. IPBBIS Study on the Image Dataset Using the Confusion Matrix
3.3.8. Functionality of the Methods Analyzed to Find GPN Protein Overexpression: IPBBIS, Manual Area Calculation, Area Calculation by K-Means Segmentation, and Area Calculation by Otsu Segmentation
4. Conclusions
- Future Work
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Lane | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
Value | 79 | 71 | 58 | 40 | 23 | 29 | 45 | 61 | 41 | 45 | 41 |
Lane | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
Concentration mg/mL | 2.0 | 0.0 | 14.5 | 9.5 | 27 | 18 | 30 | 18.5 | 26 | 14 |
Interpolation Method | Calculated Weight (kDa) | Total Error % |
---|---|---|
Linear | 33.4 | 3.35648148 |
Nearest | 37.0 | 7.060185185 |
Cubic | 31.38 | 9.194960019 |
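The error column of the interpolation table can be checked with a short calculation. The reference molecular weight is not stated in the table itself; the value of 34.56 kDa used below is an assumption, back-calculated from the reported errors, and `percent_error` is a hypothetical helper name.

```python
def percent_error(estimate: float, reference: float) -> float:
    """Absolute relative error of an estimate, expressed in percent."""
    return abs(estimate - reference) / reference * 100.0


# Assumption: reference weight inferred from the table's error values.
REFERENCE_KDA = 34.56

calculated = {"Linear": 33.4, "Nearest": 37.0, "Cubic": 31.38}
for method, weight in calculated.items():
    print(f"{method}: {percent_error(weight, REFERENCE_KDA):.2f} %")
```

Under this assumption the linear and nearest-neighbor rows are reproduced to the reported digits, while the cubic row differs in the second decimal place, presumably because the calculated weight in the table is rounded. In every case linear interpolation gives the smallest error.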
Lane | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
Concentration mg/mL | 2.0 | 0.0 | 14.5 | 9.5 | 27 | 18 | 30 | 18.5 | 26 | 14 |
ROI-GPN | 10.38 | 0.0 | 20.55 | 18.67 | 27.34 | 22.87 | 28.82 | 23.08 | 27.26 | 20.11 |
Area Manual | 513.62 | 0.0 | 830.25 | 680.12 | 1373.80 | 830.00 | 1061.50 | 934.50 | 983.50 | 869.25 |
Area K-means segmentation | 495.50 | 0.0 | 969.62 | 933.00 | 1457.20 | 1132.20 | 1336.20 | 1134.90 | 1422.10 | 993.13 |
Area Otsu segmentation | 535.75 | 0.0 | 1189.8 | 1023.4 | 1646.8 | 1318.2 | 1557.1 | 1269.1 | 1585.9 | 1174.6 |
 | Predicted Positive | Predicted Negative |
---|---|---|
Real Positive | TP = 310 | FN = 8 |
Real Negative | FP = 2 | TN = 349 |
Accuracy of Homogeneous Gels | Accuracy of Heterogeneous Gels |
---|---|
0.985052 | 0.91736 |
 | Predicted Positive | Predicted Negative |
---|---|---|
Real Positive | TP = 671 | FN = 105 |
Real Negative | FP = 24 | TN = 761 |
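The reported accuracies follow directly from the two confusion matrices. The sketch below recomputes them and adds sensitivity and specificity, which are not reported in the tables; pairing the first matrix with the homogeneous gels and the second with the heterogeneous gels is inferred from the matching accuracy values. The function name `binary_metrics` is hypothetical.

```python
from typing import Dict


def binary_metrics(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Standard summary metrics for a 2 x 2 confusion matrix."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


homogeneous = binary_metrics(tp=310, fn=8, fp=2, tn=349)
heterogeneous = binary_metrics(tp=671, fn=105, fp=24, tn=761)

for name, metrics in [("homogeneous", homogeneous), ("heterogeneous", heterogeneous)]:
    summary = ", ".join(f"{key} = {value:.4f}" for key, value in metrics.items())
    print(f"{name}: {summary}")
```

The drop in accuracy on the second set is driven mainly by false negatives (105 against 24 false positives), so it is the sensitivity, rather than the specificity, that degrades.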
Concentration mg/mL | 0 | 2 | 9.5 | 14 | 14.5 | 18 | 18.5 | 26 | 27 | 30 |
Lane | 3 | 2 | 5 | 11 | 4 | 7 | 9 | 10 | 6 | 8 |
ROI-GPN | 0 | 10.38 | 18.67 | 20.11 | 20.55 | 22.87 | 23.08 | 27.26 | 27.34 | 28.82 |
Manual Area | 0 | 513.6 | 680.1 | 869.25 | 830.25 | 830 | 934.5 | 983.5 | 1373.8 | 1061.5 |
K-means segmentation Area | 0 | 495.5 | 933 | 993.13 | 969.62 | 1132 | 1134.9 | 1422.1 | 1457.2 | 1336.2 |
Otsu segmentation Area | 0 | 535.8 | 1023 | 1174.6 | 1189.8 | 1318 | 1269.1 | 1585.9 | 1646.8 | 1557.1 |
Normalized Data |
Concentration mg/mL | 0.00 | 0.07 | 0.32 | 0.47 | 0.48 | 0.60 | 0.62 | 0.87 | 0.90 | 1.00 |
Lane | 3 | 2 | 5 | 11 | 4 | 7 | 9 | 10 | 6 | 8 |
ROI-GPN | 0.00 | 0.36 | 0.65 | 0.70 | 0.71 | 0.79 | 0.80 | 0.95 | 0.95 | 1.00 |
Manual Area | 0.00 | 0.37 | 0.50 | 0.63 | 0.60 | 0.60 | 0.68 | 0.72 | 1.00 | 0.77 |
K-means segmentation Area | 0.00 | 0.34 | 0.64 | 0.68 | 0.67 | 0.78 | 0.78 | 0.98 | 1.00 | 0.92 |
Otsu segmentation Area | 0.00 | 0.33 | 0.62 | 0.71 | 0.72 | 0.80 | 0.77 | 0.96 | 1.00 | 0.95 |
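The normalized rows are consistent with dividing each series by its maximum value (the minimum of every series is zero, so this coincides with min-max scaling). A minimal sketch for the ROI-GPN row, using a hypothetical helper `normalize_by_max`:

```python
from typing import List


def normalize_by_max(values: List[float]) -> List[float]:
    """Scale a series so that its largest value becomes 1.0."""
    peak = max(values)
    return [v / peak for v in values]


# ROI-GPN values ordered by increasing concentration, as in the table above.
roi_gpn = [0, 10.38, 18.67, 20.11, 20.55, 22.87, 23.08, 27.26, 27.34, 28.82]

normalized = [round(v, 2) for v in normalize_by_max(roi_gpn)]
print(normalized)
```

Scaling every method to the range 0 to 1 is what allows the ROI-GPN profile and the three area measurements, which have different units and magnitudes, to be compared against the normalized concentration on a common axis.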
Concentration mg/mL | 0.00 | 0.07 | 0.32 | 0.47 | 0.48 | 0.60 | 0.62 | 0.87 | 0.90 | 1.00 |
Lane | 3 | 2 | 5 | 11 | 4 | 7 | 9 | 10 | 6 | 8 |
ROI-GPN | 0.00 | 0.36 | 0.29 | 0.05 | 0.02 | 0.08 | 0.01 | 0.15 | 0.00 | 0.05 |
Comparison of Manual Area | 0.00 | 0.37 | 0.12 | 0.14 | −0.03 | 0.00 | 0.08 | 0.04 | 0.28 | −0.23 |
Comparison of K-means Segmentation Area | 0.00 | 0.34 | 0.30 | 0.04 | −0.02 | 0.11 | 0.00 | 0.20 | 0.02 | −0.08 |
Comparison of Otsu Segmentation Area | 0.00 | 0.33 | 0.30 | 0.09 | 0.01 | 0.08 | −0.03 | 0.19 | 0.04 | −0.05 |
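The values in this table appear to be the differences between consecutive normalized measurements when the lanes are sorted by concentration, so a negative entry marks a lane whose measured signal falls although its concentration rises. The sketch below applies that reading to the unnormalized data; the interpretation and the helper names `successive_differences` and `is_monotonic` are assumptions.

```python
from typing import Dict, List


def successive_differences(values: List[float]) -> List[float]:
    """First value, then each value minus its predecessor."""
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def is_monotonic(values: List[float]) -> bool:
    """True when the series never decreases."""
    return all(b >= a for a, b in zip(values, values[1:]))


# Measurements ordered by increasing concentration (0 to 30 mg/mL).
series: Dict[str, List[float]] = {
    "ROI-GPN": [0, 10.38, 18.67, 20.11, 20.55, 22.87, 23.08, 27.26, 27.34, 28.82],
    "Manual": [0, 513.6, 680.1, 869.25, 830.25, 830, 934.5, 983.5, 1373.8, 1061.5],
    "K-means": [0, 495.5, 933, 993.13, 969.62, 1132, 1134.9, 1422.1, 1457.2, 1336.2],
    "Otsu": [0, 535.8, 1023, 1174.6, 1189.8, 1318, 1269.1, 1585.9, 1646.8, 1557.1],
}

for name, values in series.items():
    drops = sum(1 for d in successive_differences(values) if d < 0)
    print(f"{name}: monotonic = {is_monotonic(values)}, decreasing steps = {drops}")
```

With these data only the ROI-GPN series increases at every step, which matches the absence of negative entries in its row of the table.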
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Juarez-Lucero, J.; Guevara-Villa, M.; Sanchez-Sanchez, A.; Diaz-Hernandez, R.; Altamirano-Robles, L. A New Algorithm for Detecting GPN Protein Expression and Overexpression of IDC and ILC Her2+ Subtypes on Polyacrylamide Gels Associated with Breast Cancer. Algorithms 2024, 17, 149. https://doi.org/10.3390/a17040149