Augmented Behavioral Annotation Tools, with Application to Multimodal Datasets and Models: A Systematic Review
Abstract
1. Introduction
Review Justification
- The use of a robust research methodology to identify, collate, and analyze papers that provide insights on technologies applicable to behavioral annotation processes (Section 2)
- A classification and discussion of studies that evaluate educational aspects of such behavioral annotation systems (Section 3)
- A digest of the major developments, and the expected future path of this research domain (Section 4)
2. Methods and Literature Review
2.1. Research Questions
2.2. Inclusion Criteria
3. Techniques of Augmented Behavioral Annotation
3.1. RQ1—What Methodologies and Frameworks Can Facilitate Annotation, Especially Those with a Multimodal Nature?
3.1.1. Segmentation Challenges and Opportunities
3.1.2. Working with Limited Quantities of Data
3.2. RQ2—How to Encode Data in Formats Which Facilitate Safe and Ethical Interchange, as Well as the Coding of Expansive and Representative Modalities/Categorizations?
3.2.1. Annotation Layers
3.2.2. Ethical Observations
3.2.3. Accessibility, Diversity, and Inclusion
3.2.4. Disproportional or Unfair Bias
3.2.5. Common Weight Space Merging
3.2.6. Prompt Injection
3.2.7. Distributional Shift
3.2.8. Copyright Issues
3.3. RQ3—How to Streamline the User Experience to Reduce Cognitive Load and Training Requirements?
3.3.1. Annotation Completion
3.3.2. Minimal Notation
3.3.3. Algorithmic Explication and Exegesis
3.3.4. Brainstorming, Summarization, and Analogizing
3.3.5. Prompt-Based Annotation
3.4. RQ4—How to Augment User Contributions to Increase Their Impact?
3.4.1. Driving Engagement
3.4.2. Collaboration
3.4.3. Indirect Collaboration Efforts
- (a) Enable the free anonymous expression of annotations.
- (b) Reward collaboration by likes.
- (c) Provide broader meaning to the annotation experience, by understanding how one’s actions have assisted others.
- (d) Entrust certain tasks to others for voluntary fulfilment to ensure completion.
3.4.4. Data Augmentation and Validation
3.5. RQ5—How to Validate Coded Information as Being Reasonable and Appropriate?
3.5.1. Context
3.5.2. Contextual Analysis
3.5.3. Analogy Mapping
3.5.4. Duplicate Monitoring
3.5.5. Annotator Feedback Applied to Pre- and Post-Annotation
3.5.6. Annotation Failure Cases
Interface Too Cumbersome or Boring
Lack of Engagement, Progress, or Meaning
Lack of Consensus, or Conversely, Groupthink
Vandalism
Polarization and Community Conflict
Unintended Consequences of Bounties
3.6. RQ6—How to Pre-Process Data or to Permit Pre-Annotation?
3.6.1. Pre-Annotation
3.6.2. Post-Annotation Validation
3.6.3. Personal Tuning through Prompt Engineering
3.6.4. Scenario Generation
3.7. RQ7—How Can Transformer-Type Technologies Be Applied to Annotation?
3.7.1. Diffusion Models
3.7.2. Towards Generalizable Machine Intelligence
3.7.3. Generalizable Training Data
- Pre-training and fine-tuning: a powerful task-agnostic model is pre-trained on a large unlabeled data corpus and then fine-tuned on the downstream task with a small set of labeled samples.
- Semi-supervised learning: the model learns from the labeled and unlabeled samples together.
- Active learning: the model learns to select the most valuable unlabeled samples to be collected next, which helps make the most of a limited labeling budget.
- Pre-training and dataset auto-generation: a capable pre-trained model is used to auto-generate further labeled samples. This approach has been especially popular within the language domain, driven by the success of few-shot learning.
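The active-learning item in the list above can be illustrated with a minimal sketch. It assumes uncertainty sampling by predictive entropy, which is one common selection strategy and not necessarily the one used in the reviewed studies. The function names, clip identifiers, and class probabilities are hypothetical and chosen only for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_labeling(predictions, budget):
    """Return ids of the `budget` unlabeled samples with the most uncertain predictions.

    predictions: dict mapping sample id -> list of predicted class probabilities.
    Samples are ranked by entropy, highest (most uncertain) first.
    """
    ranked = sorted(
        predictions,
        key=lambda sample_id: entropy(predictions[sample_id]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical model outputs for four unlabeled video clips (three behavior classes).
predictions = {
    "clip_a": [0.98, 0.01, 0.01],  # confident prediction, little to gain from labeling
    "clip_b": [0.40, 0.35, 0.25],
    "clip_c": [0.70, 0.20, 0.10],
    "clip_d": [0.34, 0.33, 0.33],  # near-uniform prediction, most uncertain
}

chosen = select_for_labeling(predictions, budget=2)
print(chosen)  # ['clip_d', 'clip_b']
```

With a budget of two, the annotator is asked to label only the two clips on which the model is least certain. In practice the selected samples would be labeled, added to the training set, and the model retrained before the next selection round.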
4. Gaps and Opportunities for Further Research
5. Final Considerations
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Athey, S. The Impact of Machine Learning on Economics. In Economics of Artificial Intelligence; University of Chicago Press: Chicago, IL, USA, 2019. [Google Scholar]
- ITUTrends. Assessing the Economic Impact of Artificial Intelligence; ITUTrends: Geneva, Switzerland, 2018. [Google Scholar]
- Ipsos MORI. Public Views of Machine Learning; Ipsos MORI: London, UK, 2017. [Google Scholar]
- Magudia, K.; Bridge, C.P.; Andriole, K.P.; Rosenthal, M.H. The Trials and Tribulations of Assembling Large Medical Imaging Datasets for Machine Learning Applications. J. Digit. Imaging 2021, 34, 1424–1429. [Google Scholar] [CrossRef] [PubMed]
- Piwowar, H.A.; Vision, T.J. Data reuse and the open data citation advantage. PeerJ 2013, 1, e175. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Thiyagalingam, J.; Shankar, M.; Fox, G.; Hey, T. Scientific machine learning benchmarks. Nat. Rev. Phys. 2022, 4, 413–420. [Google Scholar] [CrossRef]
- Roh, Y.; Heo, G.; Whang, S. A Survey on Data Collection for Machine Learning: A Big Data—Ai Integration Perspective. IEEE Trans. Knowl. Data Eng. 2021, 33, 1328–1347. [Google Scholar] [CrossRef] [Green Version]
- Guyon, I. A Scaling Law for the Validation-Set Training-Set Size Ratio; AT&T Bell Laboratories: Murray Hill, NJ, USA, 1997. [Google Scholar]
- Fernando, M.; Cèsar, F.; David, N.; José, H. Missing the missing values: The ugly duckling of fairness in machine learning. Int. J. Intell. Syst. 2021, 36, 3217–3258. [Google Scholar] [CrossRef]
- Northcutt, C.G.; Athalye, A.; Mueller, J. Pervasive Label Errors in Test Sets Destabilize Machine Learning Benchmarks. arXiv 2021, arXiv:2103.14749. [Google Scholar]
- Wissner-Gross, A. What Do You Consider the Most Interesting Recent [Scientific] New? What Makes It Important? Edge: Tel Aviv, Israel, 2016. [Google Scholar]
- Heilbron, F.C.; Escorcia, V.; Ghanem, B.; Niebles, J. Activitynet: A Large-Scale Video Benchmark for Human Activity Understanding. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015. [Google Scholar]
- Chinchilla’s Wild Implications. Available online: https://www.alignmentforum.org/posts/6Fpvch8RR29qLEWNH/chinchilla-s-wild-implications (accessed on 18 October 2022).
- (My Understanding of) What Everyone in Technical Alignment Is Doing and Why. Available online: https://www.lesswrong.com/posts/QBAjndPuFbhEXKcCr/my-understanding-of-what-everyone-in-technical-alignment-is (accessed on 18 October 2022).
- Barrett, J.; Viana, T. Emm-Lc Fusion: Enhanced Multimodal Fusion for Lung Cancer Classification. Ai 2022, 3, 659–682. [Google Scholar] [CrossRef]
- Moravec, H.P. When Will Computer Hardware Match the Human Brain. J. Evol. Technol. 1998, 1, 10. [Google Scholar]
- No Language Left Behind: Scaling Human-Centered Machine Translation. Available online: https://research.facebook.com/publications/no-language-left-behind/ (accessed on 18 October 2022).
- Bhoopchand, A.; Brownfield, B.; Collister, A.; Lago, A.; Edwards, A.; Everett, R.; Frechette, A.; Oliveira, Y.; Hughes, E.; Mathewson, K.; et al. Learning Robust Real-Time Cultural Transmission without Human Data. arXiv 2022, arXiv:2203.00715. [Google Scholar]
- Mirowski, P.W.; Mathewson, K.; Pittman, J.; Evans, R. Co-Writing Screenplays and Theatre Scripts with Language Models: An Evaluation by Industry Professionals. arXiv 2022, arXiv:2209.14958. [Google Scholar]
- Adate, A.; Arya, D.; Shaha, A.; Tripathy, B. Impact of Deep Neural Learning on Artificial Intelligence Research. In Deep Learning: Research and Applications; De Gruyter: Berlin, Germany, 2020; pp. 68–84. [Google Scholar]
- Sarker, I.H. Deep Learning: A Comprehensive Overview on Techniques, Taxonomy, Applications and Research Directions. SN Comput. Sci. 2021, 6, 420. [Google Scholar] [CrossRef] [PubMed]
- Weissler, E.H.; Naumann, T.; Andersson, T.; Ranganath, R.; Elemento, O.; Luo, Y.; Freitag, D.; Benoit, J.; Hughes, M.; Khan, F.; et al. The Role of Machine Learning in Clinical Research: Transforming the Future of Evidence Generation. Trials 2021, 22, 1–15. [Google Scholar] [CrossRef]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.; Kaiser, L.; Polosukhin, I. Attention Is All You Need. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017. [Google Scholar]
- Bommasani, R.; Hudson, D.; Adeli, E.; Altman, R.; Arora, S.; Arx, S.; Bernstein, M.; Bohg, J.; Bosselut, A.; Brunskill, E.; et al. On the Opportunities and Risks of Foundation Models. arXiv 2021, arXiv:2108.07258. [Google Scholar]
- Liang, P.P.; Zadeh, A.; Morency, L.-P. Foundations and Recent Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions. arXiv 2022, arXiv:2209.03430. [Google Scholar]
- Kitchenham, B.; Charters, S. Guidelines for Performing Systematic Literature Reviews in Software Engineering. Available online: https://www.elsevier.com/__data/promis_misc/525444systematicreviewsguide.pdf (accessed on 22 October 2022).
- Glanville, J.; McCool, R. What Is a Systematic Review? Evid. Based Health Care 2014, 14, 3. [Google Scholar]
- Wohlin, C.; Runeson, P.; Höst, M.; Ohlsson, M.; Regnell, B.; Wesslén, A. Experimentation in Software Engineering. Available online: https://link.springer.com/book/10.1007/978-3-642-29044-2 (accessed on 22 October 2022).
- Brereton, P.; Kitchenham, B.; Budgen, D.; Turner, M.; Khalil, M. Lessons from Applying the Systematic Literature Review Process within the Software Engineering Domain. J. Syst. Softw. 2007, 80, 571–583. [Google Scholar] [CrossRef] [Green Version]
- Martinho, D.; Carneiro, J.; Corchado, J.; Marreiros, G. A Systematic Review of Gamification Techniques Applied to Elderly Care. Artif. Intell. Rev. 2020, 53, 4863–4901. [Google Scholar] [CrossRef]
- Xiao, Y.; Watson, M. Guidance on Conducting a Systematic Literature Review. J. Plan. Educ. Res. 2017, 39, 93–112. [Google Scholar] [CrossRef]
- Novak, I.; Hines, M.; Goldsmith, S.; Barclay, R. Clinical Prognostic Messages from a Systematic Review on Cerebral Palsy. Pediatrics 2012, 130, e1285–e1312. [Google Scholar] [CrossRef] [Green Version]
- Introduction to Conducting a Systematic Review (Online via Zoom). Available online: https://calendar.lib.unc.edu/event/7216262 (accessed on 16 November 2021).
- Dyba, T.; Kitchenham, B.; Jorgensen, M. Evidence-Based Software Engineering for Practitioners. IEEE Softw. 2005, 22, 58–65. [Google Scholar] [CrossRef] [Green Version]
- Kitchenham, B.A.; Dyba, T.; Jorgensen, M. Evidence-Based Software Engineering. In Proceedings of the 26th International Conference on Software Engineering, Washington, DC, USA, 23–28 May 2004. [Google Scholar]
- Kitchenham, B.A.; Budgen, D.; Brereton, P. Evidence-Based Software Engineering and Systematic Reviews; CRC Press: Boca Raton, FL, USA, 2015. [Google Scholar]
- Wohlin, C.; Prikladnicki, R. Systematic Literature Reviews in Software Engineering. Inf. Softw. Technol. 2013, 55, 919–920. [Google Scholar] [CrossRef]
- Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA). Available online: https://www.prisma-statement.org (accessed on 23 October 2022).
- Hamilton, M.; Zhang, Z.; Hariharan, B.; Snavely, N.; Freeman, W. Unsupervised Semantic Segmentation by Distilling Feature Correspondences. arXiv 2022, arXiv:2203.08414. [Google Scholar]
- Liu, A.H.; Jin, S.; Lai, C.-I.; Rouditchenko, A.; Oliva, A.; Glass, J. Cross-Modal Discrete Representation Learning. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, Dublin, Ireland, 22–27 May 2022. [Google Scholar]
- Kolesnikov, A.; Pinto, A.; Beyer, L.; Zhai, X.; Harmsen, J.; Houlsby, N. Uvim: A Unified Modeling Approach for Vision with Learned Guiding Codes. arXiv 2022, arXiv:2205.10337. [Google Scholar]
- Elmoznino, E.; Bonner, M. High-Performing Neural Network Models of Visual Cortex Benefit from High Latent Dimensionality. bioRxiv, 2022; preprint. [Google Scholar]
- Qin, B.; Mao, H.; Zhang, R.; Zhu, Y.; Ding, S.; Chen, X. Working Memory Inspired Hierarchical Video Decomposition with Transformative Representations. arXiv 2022, arXiv:2204.10105. [Google Scholar]
- Parthasarathy, N.; Eslami, S.; Carreira, J.; Hénaff, O. Self-Supervised Video Pretraining Yields Strong Image Representations. arXiv 2022, arXiv:2210.06433. [Google Scholar]
- H’enaff, O.J.; Koppula, S.; Shelhamer, E.; Zoran, D.; Jaegle, A.; Zisserman, A.; Carreira, J.; Arandjelovi’c, R. Object Discovery and Representation Networks. arXiv 2022, arXiv:2203.08777. [Google Scholar]
- Chen, X.; Wang, X.; Changpinyo, S.; Piergiovanni, A.; Padlewski, P.; Salz, D.; Goodman, S.; Grycner, A.; Mustafa, B.; Beyer, L.; et al. Pali: A Jointly-Scaled Multilingual Language-Image Model. arXiv 2022, arXiv:2209.06794. [Google Scholar]
- Alayrac, J.-B.; Donahue, J.; Luc, P.; Miech, A.; Barr, I.; Hasson, Y.; Lenc, K.; Mensch, A.; Millican, K.; Reynolds, M.; et al. Flamingo: A Visual Language Model for Few-Shot Learning. arXiv 2022, arXiv:2204.14198. [Google Scholar]
- Girdhar, R.; Singh, M.; Ravi, N.; Maaten, L.; Joulin, A.; Misra, I. Omnivore: A Single Model for Many Visual Modalities. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 16081–16091. [Google Scholar]
- Hernandez, E.; Schwettmann, S.; Bau, D.; Bagashvili, T.; Torralba, A.; Andreas, J. Natural Language Descriptions of Deep Visual Features. arXiv 2022, arXiv:2201.11114. [Google Scholar]
- Baevski, A.; Hsu, W.-N.; Xu, Q.; Babu, A.; Gu, J.; Auli, M. Data2vec: A General Framework for Self-Supervised Learning in Speech, Vision and Language. arXiv 2022, arXiv:2202.03555. [Google Scholar]
- Meng, Y.; Huang, J.; Zhang, Y.; Han, J. Generating Training Data with Language Models: Towards Zero-Shot Language Understanding. arXiv 2022, arXiv:2202.04538. [Google Scholar]
- Whitfield, D. Using GPT-2 to Create Synthetic Data to Improve the Prediction Performance of NLP Machine Learning Classification Models. arXiv 2021, arXiv:2104.10658. [Google Scholar]
- Uchendu, I.; Xiao, T.; Lu, Y.; Zhu, B.; Yan, M.; Simón, J.; Bennice, M.; Fu, C.; Ma, C.; Jiao, J.; et al. Jump-Start Reinforcement Learning. arXiv 2022, arXiv:2204.02372. [Google Scholar]
- Nichol, A.; Dhariwal, P.; Ramesh, A.; Shyam, P.; Mishkin, P.; McGrew, B.; Sutskever, I.; Chen, M. GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models. arXiv 2021, arXiv:2112.10741. [Google Scholar]
- Borzunov, A.; Baranchuk, D.; Dettmers, T.; Ryabinin, M.; Belkada, Y.; Chumachenko, A.; Samygin, P.; Raffel, C. Petals: Collaborative Inference and Fine-Tuning of Large Models. arXiv 2022, arXiv:2209.01188. [Google Scholar]
- Our Approach to Alignment Research. Available online: https://openai.com/blog/our-approach-to-alignment-research/ (accessed on 18 October 2022).
- Ouyang, L.; Wu, J.; Jiang, X.; Almeida, D.; Wainwright, C.; Mishkin, P.; Zhang, C.; Agarwal, S.; Slama, K.; Ray, A.; et al. Training Language Models to Follow Instructions with Human Feedback. arXiv 2022, arXiv:2203.02155. [Google Scholar]
- The First High-Performance Self-Supervised Algorithm That Works for Speech, Vision, and Text. Available online: https://ai.facebook.com/blog/the-first-high-performance-self-supervised-algorithm-that-works-for-speech-vision-and-text/ (accessed on 18 October 2022).
- Tiu, E.; Talius, E.; Patel, P.; Langlotz, C.; Ng, A.; Rajpurkar, P. Expert-Level Detection of Pathologies from Unannotated Chest X-Ray Images Via Self-Supervised Learning. Nat. Biomed. Eng. 2022, 6, 1399–1406. [Google Scholar] [CrossRef]
- Thrush, T.; Jiang, R.; Bartolo, M.; Singh, A.; Williams, A.; Kiela, D.; Ross, C. Winoground: Probing Vision and Language Models for Visio-Linguistic Compositionality. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 5228–5238. [Google Scholar]
- Does This Artificial Intelligence Think Like a Human? Available online: https://news.mit.edu/2022/does-this-artificial-intelligence-think-human-0406 (accessed on 18 October 2022).
- Botach, A.; Zheltonozhskii, E.; Baskin, C. End-to-End Referring Video Object Segmentation with Multimodal Transformers. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 4975–4985. [Google Scholar]
- Plotz, T.; Chen, C.; Hammerla, N.; Abowd, G. Automatic Synchronization of Wearable Sensors and Video-Cameras for Ground Truth Annotation—A Practical Approach. In Proceedings of the 2012 16th International Symposium on Wearable Computers, Newcastle, UK, 18–22 June 2012; pp. 100–103. [Google Scholar]
- Marcus, G.; Davis, E.; Aaronson, S. A Very Preliminary Analysis of Dall-E 2. arXiv 2022, arXiv:2204.13807. [Google Scholar]
- Dall·E 2. Available online: https://openai.com/dall-e-2/ (accessed on 18 October 2022).
- What Dall-E 2 Can and Cannot Do. Available online: https://www.lesswrong.com/posts/uKp6tBFStnsvrot5t/what-dall-e-2-can-and-cannot-do (accessed on 18 October 2022).
- OpenAI: DALL·E 2 Preview—Risks and Limitations. Available online: https://github.com/openai/dalle-2-preview/blob/main/system-card.md (accessed on 18 October 2022).
- Everything You Wanted to Know About Midjourney. Available online: https://dallery.gallery/midjourney-guide-ai-art-explained/ (accessed on 18 October 2022).
- Ai by the People, for the People. Available online: https://stability.ai (accessed on 18 October 2022).
- Craiyon Home Page. Available online: https://www.craiyon.com/ (accessed on 18 October 2022).
- Yu, J.; Xu, Y.; Koh, J.; Luong, T.; Baid, G.; Wang, Z.; Vasudevan, V.; Ku, A.; Yang, Y.; Ayan, B.; et al. Scaling Autoregressive Models for Content-Rich Text-to-Image Generation. arXiv 2022, arXiv:2206.10789. [Google Scholar]
- Imagen. Available online: https://gweb-research-imagen.appspot.com/paper.pdf (accessed on 18 October 2022).
- Kawar, B.; Zada, S.; Lang, O.; Tov, O.; Chang, H.; Dekel, T.; Mosseri, I.; Irani, M. Imagic: Text-Based Real Image Editing with Diffusion Models. arXiv 2022, arXiv:2210.09276. [Google Scholar]
- Huang, X.; Mallya, A.; Wang, T.-C.; Liu, M.-Y. Multimodal Conditional Image Synthesis with Product-of-Experts Gans. arXiv 2021, arXiv:2112.05130. [Google Scholar]
- The Gradient. Available online: https://thegradient.pub/nlps-clever-hans-moment-has-arrived/ (accessed on 18 October 2022).
- Katada, S.; Okada, S.; Komatani, K. Effects of Physiological Signals in Different Types of Multimodal Sentiment Estimation. IEEE Trans. Affect. Comput. 2022, 1. [Google Scholar] [CrossRef]
- Ramrakhya, R.; Undersander, E.; Batra, D.; Das, A. Habitat-Web: Learning Embodied Object-Search Strategies from Human Demonstrations at Scale. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 5163–5173. [Google Scholar]
- Google Ai Blog: Simple and Effective Zero-Shot Task-Oriented Dialogue. Available online: https://ai.googleblog.com/2022/04/simple-and-effective-zero-shot-task.html (accessed on 18 October 2022).
- Google Ai Blog: Introducing the Schema-Guided Dialogue Dataset for Conversational Assistants. Available online: https://ai.googleblog.com/2019/10/introducing-schema-guided-dialogue.html (accessed on 18 October 2022).
- Chen, T.; La, L.; Saxena, S.; Hinton, G.; Fleet, D. A Generalist Framework for Panoptic Segmentation of Images and Videos. arXiv 2022, arXiv:2210.06366. [Google Scholar]
- Yu, R.; Park, H.; Lee, J. Human Dynamics from Monocular Video with Dynamic Camera Movements. ACM Trans. Graph. 2021, 40, 1–14. [Google Scholar] [CrossRef]
- EPFL: Realistic Graphics Lab. Available online: http://rgl.epfl.ch/publications/Vicini2022SDF (accessed on 18 October 2022).
- Botach, A.; Zheltonozhskii, E.; Baskin, C. Technion – Israel Institute of Technology: End-to-End Referring Video Object Segmentation with Multimodal Transformers. Available online: https://github.com/mttr2021/MTTR (accessed on 18 October 2022).
- Li, X.L.; Liang, P. Prefix-Tuning: Optimizing Continuous Prompts for Generation. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Online, 1–6 August 2021; pp. 4582–4597. [Google Scholar]
- Pang, B.; Nijkamp, E.; Kryscinski, W.; Savarese, S.; Zhou, Y.; Xiong, C. Long Document Summarization with Top-Down and Bottom-up Inference. arXiv 2022, arXiv:2203.07586. [Google Scholar]
- Baltrušaitis, T.; Ahuja, C.; Morency, L.-P. Multimodal Machine Learning: A Survey and Taxonomy. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 41, 423–443. [Google Scholar] [CrossRef] [Green Version]
- Kraus, M.; Angerbauer, K.; Buchmüller, J.; Schweitzer, D.; Keim, D.; Sedlmair, M.; Fuchs, J. Assessing 2D and 3D Heatmaps for Comparative Analysis. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 25–30 April 2020; pp. 1–14. [Google Scholar]
- Haarman, B.C.M.; der Lek, R.R.-V.; Nolen, W.; Mendes, R.; Drexhage, H.; Burger, H. Feature-Expression Heat Maps—A New Visual Method to Explore Complex Associations between Two Variable Sets. J. Biomed. Inform. 2015, 53, 156–161. [Google Scholar] [CrossRef] [Green Version]
- Paun, S.; Carpenter, B.; Chamberlain, J.; Hovy, D.; Kruschwitz, U.; Poesio, M. Comparing Bayesian Models of Annotation. Trans. Assoc. Comput. Linguist. 2018, 6, 571–585. [Google Scholar] [CrossRef]
- Thaler, F.; Payer, C.; Urschler, M.; Štern, D. Modeling Annotation Uncertainty with Gaussian Heatmaps in Landmark Localization. arXiv 2021, arXiv:2109.09533. [Google Scholar]
- Sun, Z.-H.; Jia, K.-B. Image Annotation and Refinement with Markov Chain Model of Visual Keywords and the Semantics. In Intelligent Data Analysis and Its Applications; Springer: New York, NY, USA, 2014; pp. 375–384. [Google Scholar]
- Reher, R.; Kim, H.; Zhang, C.; Mao, H.; Wang, M.; Nothias, L.-F.; Caraballo-Rodriguez, A.; Glukhov, E.; Teke, B.; Leao, T.; et al. A Convolutional Neural Network-Based Approach for the Rapid Annotation of Molecularly Diverse Natural Products. J. Am. Chem. Soc. 2020, 142, 4114–4120. [Google Scholar] [CrossRef] [PubMed]
- Li, Z.; Xu, Z.; Zhang, R.; Zou, H.; Gao, F. Design of Modified 2-Degree-of-Freedom Proportional–Integral–Derivative Controller for Unstable Processes. Meas. Control 2020, 53, 1465–1471. [Google Scholar] [CrossRef]
- Schwartz, B.; Ward, A. Doing Better but Feeling Worse: The Paradox of Choice. In Positive Psychology in Practice; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2004; pp. 86–104. [Google Scholar]
- Luccioni, A.S.; Rolnick, D. Bugs in the Data: How Imagenet Misrepresents Biodiversity. arXiv 2022, arXiv:2208.11695. [Google Scholar]
- Mitchell, E.; Lin, C.; Bosselut, A.; Finn, C.; Manning, C. Fast Model Editing at Scale. arXiv 2022, arXiv:2110.11309. [Google Scholar]
- Juneja, J.; Bansal, R.; Cho, K.; Sedoc, J.; Saphra, N. Linear Connectivity Reveals Generalization Strategies. arXiv 2022, arXiv:2205.12411. [Google Scholar]
- Ainsworth, S.K.; Hayase, J.; Srinivasa, S. Git Re-Basin: Merging Models Modulo Permutation Symmetries. arXiv 2022, arXiv:2209.04836. [Google Scholar]
- Ainsworth, S.K.; Foti, N.; Fox, E. Disentangled Vae Representations for Multi-Aspect and Missing Data. arXiv 2018, arXiv:1806.09060. [Google Scholar]
- Jain, V.; Chaudhary, G.; Luthra, N.; Rao, A.; Walia, S. Dynamic Handwritten Signature and Machine Learning Based Identity Verification for Keyless Cryptocurrency Transactions. J. Discret. Math. Sci. Cryptogr. 2019, 22, 191–202. [Google Scholar] [CrossRef]
- Cheung, B.; Terekhov, A.; Chen, Y.; Agrawal, P.; Olshausen, B. Superposition of Many Models into One. arXiv 2019, arXiv:1902.05522. [Google Scholar]
- Chen, A.M.; Lu, H.-m.; Hecht-Nielsen, R. On the Geometry of Feedforward Neural Network Error Surfaces. Neural Comput. 1993, 5, 910–927. [Google Scholar] [CrossRef]
- Transformer Circuits Thread: Toy Models of Superposition. Available online: https://transformer-circuits.pub/2022/toy_model/index.html (accessed on 18 October 2022).
- Simon Willison’s Weblog: Prompt Injection Attacks against GPT-3. Available online: https://simonwillison.net/2022/Sep/12/prompt-injection/ (accessed on 18 October 2022).
- Gandelsman, Y.; Sun, Y.; Chen, X.; Efros, A. Test-Time Training with Masked Autoencoders. arXiv 2022, arXiv:2209.07522. [Google Scholar]
- Exploring 12 Million of the 2.3 Billion Images Used to Train Stable Diffusion’s Image Generator. Available online: https://waxy.org/2022/08/exploring-12-million-of-the-images-used-to-train-stable-diffusions-image-generator/ (accessed on 18 October 2022).
- Artist Finds Private Medical Record Photos in Popular Ai Training Data Set. Available online: https://arstechnica.com/information-technology/2022/09/artist-finds-private-medical-record-photos-in-popular-ai-training-data-set/ (accessed on 18 October 2022).
- GitHub: Your Ai Pair Programmer. Available online: https://github.com/features/copilot (accessed on 18 October 2022).
- Nijkamp, E.; Pang, B.; Hayashi, H.; Tu, L.; Wang, H.; Zhou, Y.; Savarese, S.; Xiong, C. Codegen: An Open Large Language Model for Code with Multi-Turn Program Synthesis. arXiv 2022, arXiv:2203.13474. [Google Scholar]
- CodeGeeX: A Multilingual Code Generation Model. Available online: http://keg.cs.tsinghua.edu.cn/codegeex/ (accessed on 18 October 2022).
- Christopoulou, F.; Lampouras, G.; Gritta, M.; Zhang, G.; Guo, Y.; Li, Z.-Y.; Zhang, Q.; Xiao, M.; Shen, B.; Li, L.; et al. Pangu-Coder: Program Synthesis with Function-Level Language Modeling. arXiv 2022, arXiv:2207.11280. [Google Scholar]
- Simon Willison’s Weblog: Using GPT-3 to Explain How Code Works. Available online: https://simonwillison.net/2022/Jul/9/gpt-3-explain-code (accessed on 18 October 2022).
- Haluptzok, P.M.; Bowers, M.; Kalai, A. Language Models Can Teach Themselves to Program Better. arXiv 2022, arXiv:2207.14502. [Google Scholar]
- Bavarian, M.; Jun, H.; Tezak, N.; Schulman, J.; McLeavey, C.; Tworek, J.; Chen, M. Efficient Training of Language Models to Fill in the Middle. arXiv 2022, arXiv:2207.14255. [Google Scholar]
- Risko, E.F.; Foulsham, T.; Dawson, S.; Kingstone, A. The Collaborative Lecture Annotation System (Clas): A New Tool for Distributed Learning. IEEE Trans. Learn. Technol. 2013, 6, 4–13. [Google Scholar] [CrossRef]
- Long, J.; Shelhamer, E.; Darrell, T. Fully Convolutional Networks for Semantic Segmentation. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015. [Google Scholar]
- Sultana, F.; Sufian, A.; Dutta, P. Evolution of Image Segmentation Using Deep Convolutional Neural Network: A Survey. arXiv 2020, arXiv:abs/2001.04074. [Google Scholar] [CrossRef]
- Li, Y.; Wei, J.; Liu, Y.; Kauttonen, J.; Zhao, G. Deep Learning for Micro-Expression Recognition: A Survey. IEEE Trans. Affect. Comput. 2022, 13, 2028–2046. [Google Scholar] [CrossRef]
- Andersen, P.H.; Broomé, S.; Rashid, M.; Lundblad, J.; Ask, K.; Li, Z.; Hernlund, E.; Rhodin, M.; Kjellström, H. Towards Machine Recognition of Facial Expressions of Pain in Horses. Animals 2021, 11, 1643. [Google Scholar] [CrossRef] [PubMed]
- Boneh-Shitrit, T.; Amir, S.; Bremhorst, A.; Mills, D.; Riemer, S.; Fried, D.; Zamansky, A. Deep Learning Models for Automated Classification of Dog Emotional States from Facial Expressions. arXiv 2022, arXiv:2206.05619. [Google Scholar]
- Rubinstein, M. Analysis and Visualization of Temporal Variations in Video. Doctoral Dissertation, Massachusetts Institute of Technology, Cambridge, MA, USA, 2014. [Google Scholar]
- Ideas Ai Home Page. Available online: https://ideasai.com (accessed on 18 October 2022).
- Twitter Page: Simon Willison. Available online: https://twitter.com/simonw/status/1555626060384911360 (accessed on 18 October 2022).
- Flexible Diffusion Modeling of Long Videos. Available online: https://plai.cs.ubc.ca/2022/05/20/flexible-diffusion-modeling-of-long-videos/ (accessed on 18 October 2022).
- Li, Z.; Wang, Q.; Snavely, N.; Kanazawa, A. Infinitenature-Zero: Learning Perpetual View Generation of Natural Scenes from Single Images. arXiv 2022, arXiv:2207.11148. [Google Scholar]
- Barron, J.T.; Mildenhall, B.; Verbin, D.; Srinivasan, P.; Hedman, P. Mip-Nerf 360: Unbounded Anti-Aliased Neural Radiance Fields. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 5460–5469. [Google Scholar]
- Elicit: Language Models as Research Assistants. Available online: https://www.alignmentforum.org/posts/s5jrfbsGLyEexh4GT/elicit-language-models-as-research-assistants (accessed on 18 October 2022).
- Archive.Today: @Michaeltefula. Available online: https://archive.ph/9eiPn#selection-2773.0-2775.13 (accessed on 18 October 2022).
- Jargon Home Page. Available online: https://explainjargon.com/ (accessed on 18 October 2022).
- Promptbase: Dall·E, GPT-3, Midjourney, Stable Diffusion Prompt Marketplace. Available online: https://promptbase.com/ (accessed on 19 October 2022).
- The Dall·E 2 Prompt Book. Available online: http://dallery.gallery/the-dalle-2-prompt-book/ (accessed on 19 October 2022).
- Lexica: The Stable Diffusion Search Engine. Available online: https://lexica.art/ (accessed on 19 October 2022).
- Belay Labs: Introducing GPT Explorer. Available online: https://belay-labs.github.io/gpt-explorer/introducing-gpt-explorer.html (accessed on 19 October 2022).
- Imagine Prompter Guide. Available online: https://prompterguide.com/ (accessed on 19 October 2022).
- Promptomania. Available online: https://promptomania.com/ (accessed on 19 October 2022).
- Clip Interrogator. Available online: https://huggingface.co/spaces/pharma/CLIP-Interrogator (accessed on 23 October 2022).
- Arora, S.; Narayan, A.; Chen, M.; Orr, L.; Guha, N.; Bhatia, K.; Chami, I.; Sala, F.; Ré, C. Ask Me Anything: A Simple Strategy for Prompting Language Models. arXiv 2022, arXiv:2210.02441. [Google Scholar]
- Press, O.; Zhang, M.; Min, S.; Schmidt, L.; Smith, N.; Lewis, M. Measuring and Narrowing the Compositionality Gap in Language Models. arXiv 2022, arXiv:2210.03350. [Google Scholar]
- Jiang, Y.; Gupta, A.; Zhang, Z.; Wang, G.; Dou, Y.; Chen, Y.; Fei-Fei, L.; Anandkumar, A.; Zhu, Y.; Fan, L. Vima: General Robot Manipulation with Multimodal Prompts. arXiv 2022, arXiv:2210.03094. [Google Scholar]
- Ahn, M.; Brohan, A.; Brown, N.; Chebotar, Y.; Cortes, O.; David, B.; Finn, C.; Gopalakrishnan, K.; Hausman, K.; Herzog, A.; et al. Do as I Can, Not as I Say: Grounding Language in Robotic Affordances. arXiv 2022, arXiv:2204.01691. [Google Scholar]
- Zeng, A.; Wong, A.; Welker, S.; Choromanski, K.; Tombari, F.; Purohit, A.; Ryoo, M.; Sindhwani, V.; Lee, J.; Vanhoucke, V.; et al. Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language. arXiv 2022, arXiv:2204.00598. [Google Scholar]
- Huang, W.; Abbeel, P.; Pathak, D.; Mordatch, I. Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents. arXiv 2022, arXiv:2201.07207. [Google Scholar]
- Shah, D.; Osinski, B.; Ichter, B.; Levine, S. LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action. arXiv 2022, arXiv:2207.04429. [Google Scholar]
- Huang, W.; Xia, F.; Xiao, T.; Chan, H.; Liang, J.; Florence, P.; Zeng, A.; Tompson, J.; Mordatch, I.; Chebotar, Y.; et al. Inner Monologue: Embodied Reasoning through Planning with Language Models. arXiv 2022, arXiv:2207.05608. [Google Scholar]
- Kant, Y.; Ramachandran, A.; Yenamandra, S.; Gilitschenski, I.; Batra, D.; Szot, A.; Agrawal, H. Housekeep: Tidying Virtual Households Using Commonsense Reasoning. arXiv 2022, arXiv:2205.10712. [Google Scholar]
- Li, S.; Puig, X.; Du, Y.; Wang, C.; Akyürek, E.; Torralba, A.; Andreas, J.; Mordatch, I. Pre-Trained Language Models for Interactive Decision-Making. arXiv 2022, arXiv:2202.01771. [Google Scholar]
- Bucker, A.F.C.; Figueredo, L.; Haddadin, S.; Kapoor, A.; Ma, S.; Vemprala, S.; Bonatti, R. Latte: Language Trajectory Transformer. arXiv 2022, arXiv:2208.02918. [Google Scholar]
- Cui, Y.; Niekum, S.; Gupta, A.; Kumar, V.; Rajeswaran, A. Can Foundation Models Perform Zero-Shot Task Specification for Robot Manipulation? arXiv 2022, arXiv:2204.11134. [Google Scholar]
- Tam, A.C.; Rabinowitz, N.; Lampinen, A.; Roy, N.; Chan, S.; Strouse, D.; Wang, J.; Banino, A.; Hill, F. Semantic Exploration from Language Abstractions and Pretrained Representations. arXiv 2022, arXiv:2204.05080. [Google Scholar]
- Khandelwal, A.; Weihs, L.; Mottaghi, R.; Kembhavi, A. Simple but Effective: CLIP Embeddings for Embodied AI. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 14809–14818. [Google Scholar]
- Shridhar, M.; Manuelli, L.; Fox, D. CLIPort: What and Where Pathways for Robotic Manipulation. arXiv 2021, arXiv:2109.12098. [Google Scholar]
- Lin, B.; Zhu, Y.; Chen, Z.; Liang, X.; Liu, J.-Z.; Liang, X. ADAPT: Vision-Language Navigation with Modality-Aligned Action Prompts. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 15375–15385. [Google Scholar]
- Parisi, S.; Rajeswaran, A.; Purushwalkam, S.; Gupta, A. The Unsurprising Effectiveness of Pre-Trained Vision Models for Control. arXiv 2022, arXiv:2203.03580. [Google Scholar]
- Gadre, S.Y.; Wortsman, M.; Ilharco, G.; Schmidt, L.; Song, S. CLIP on Wheels: Zero-Shot Object Navigation as Object Localization and Exploration. arXiv 2022, arXiv:2203.10421. [Google Scholar]
- Hong, Y.; Wu, Q.; Qi, Y.; Rodriguez-Opazo, C.; Gould, S. VLN BERT: A Recurrent Vision-and-Language BERT for Navigation. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 1643–1653. [Google Scholar]
- Majumdar, A.; Shrivastava, A.; Lee, S.; Anderson, P.; Parikh, D.; Batra, D. Improving Vision-and-Language Navigation with Image-Text Pairs from the Web. arXiv 2020, arXiv:2004.14973. [Google Scholar]
- Waymo: Simulation City: Introducing Waymo’s Most Advanced Simulation System yet for Autonomous Driving. Available online: https://blog.waymo.com/2021/06/SimulationCity.html (accessed on 19 October 2022).
- Liu, P.; Yuan, W.; Fu, J.; Jiang, Z.; Hayashi, H.; Neubig, G. Pre-Train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing. arXiv 2021, arXiv:2107.13586. [Google Scholar] [CrossRef]
- Prompting: Better Ways of Using Language Models for NLP Tasks. Available online: https://thegradient.pub/prompting/ (accessed on 19 October 2022).
- Nerd for Tech: Prompt Engineering: The Career of Future. Available online: https://medium.com/nerd-for-tech/prompt-engineering-the-career-of-future-2fb93f90f117 (accessed on 19 October 2022).
- Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Minneapolis, MN, USA, 2–7 June 2019; pp. 4171–4186. [Google Scholar]
- Brown, T.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language Models Are Few-Shot Learners. Adv. Neural Inf. Process. Syst. 2020, 33, 1877–1901. [Google Scholar]
- Min, S.; Lyu, X.; Holtzman, A.; Artetxe, M.; Lewis, M.; Hajishirzi, H.; Zettlemoyer, L. Rethinking the Role of Demonstrations: What Makes in-Context Learning Work? arXiv 2022, arXiv:2202.12837. [Google Scholar]
- Garg, S.; Tsipras, D.; Liang, P.; Valiant, G. What Can Transformers Learn in-Context? A Case Study of Simple Function Classes. arXiv 2022, arXiv:2208.01066. [Google Scholar]
- Towards Data Science: Almost No Data and No Time? Unlocking the True Potential of GPT3, a Case Study. Available online: https://towardsdatascience.com/almost-no-data-and-no-time-unlocking-the-true-potential-of-gpt3-a-case-study-b4710ca0614a (accessed on 19 October 2022).
- Twitter Post of Gene Kogan: Desert Landscape at Sunrise in Studio Ghibli Style. Available online: https://twitter.com/genekogan/status/1512513827031580673 (accessed on 19 October 2022).
- Bai, Y.; Jones, A.; Ndousse, K.; Askell, A.; Chen, A.; DasSarma, N.; Drain, D.; Fort, S.; Ganguli, D.; Henighan, T.; et al. Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback. arXiv 2022, arXiv:2204.05862. [Google Scholar]
- Roboflow: Experimenting with CLIP+VQGAN to Create AI Generated Art. Available online: https://blog.roboflow.com/ai-generated-art/ (accessed on 19 October 2022).
- White, A.D.; Hocky, G.; Gandhi, H.; Ansari, M.; Cox, S.; Wellawatte, G.; Sasmal, S.; Yang, Z.; Liu, K.; Singh, Y.; et al. Do Large Language Models Know Chemistry? ChemRxiv Preprint; Cambridge Open Engage: Cambridge, UK, 2022. [CrossRef]
- Twitter Post of Riley Goodside from 15 April 2022. Available online: https://twitter.com/goodside/status/1515128035439255553 (accessed on 19 October 2022).
- Li, Y.; Lin, Z.; Zhang, S.; Fu, Q.; Chen, B.; Lou, J.-G.; Chen, W. On the Advance of Making Language Models Better Reasoners. arXiv 2022, arXiv:2206.02336. [Google Scholar]
- Twitter Post of Cuddlysalmon: Decided to Try a GPT-3/Dall E Crossover Experiment Today. Available online: https://twitter.com/nptacek/status/1548402120075800577 (accessed on 19 October 2022).
- Thread Reader: User Magnus Petersen. Available online: https://threadreaderapp.com/thread/1564633854119477257.html (accessed on 19 October 2022).
- Daras, G.; Dimakis, A. Discovering the Hidden Vocabulary of DALLE-2. arXiv 2022, arXiv:2206.00169. [Google Scholar]
- Introducing the World’s Largest Open Multilingual Language Model: BLOOM. Available online: https://bigscience.huggingface.co/blog/bloom (accessed on 19 October 2022).
- GLM-130B: An Open Bilingual Pre-Trained Model. Available online: http://keg.cs.tsinghua.edu.cn/glm-130b/posts/glm-130b/ (accessed on 19 October 2022).
- Dohan, D.; Xu, W.; Lewkowycz, A.; Austin, J.; Bieber, D.; Lopes, R.; Wu, Y.; Michalewski, H.; Saurous, R.; Sohl-Dickstein, J.; et al. Language Model Cascades. arXiv 2022, arXiv:2207.10342. [Google Scholar]
- Argyle, L.P.; Busby, E.; Fulda, N.; Gubler, J.; Rytting, C.; Wingate, D. Out of One, Many: Using Language Models to Simulate Human Samples. arXiv 2022, arXiv:2209.06899. [Google Scholar]
- Aher, G.; Arriaga, R.; Kalai, A. Using Large Language Models to Simulate Multiple Humans. arXiv 2022, arXiv:2208.10264. [Google Scholar]
- Borgeaud, S.; Mensch, A.; Hoffmann, J.; Cai, T.; Rutherford, E.; Millican, K.; Driessche, G.; Lespiau, J.-B.; Damoc, B.; Clark, A.; et al. Improving Language Models by Retrieving from Trillions of Tokens. arXiv 2021, arXiv:2112.04426. [Google Scholar]
- Izacard, G.; Lewis, P.; Lomeli, M.; Hosseini, L.; Petroni, F.; Schick, T.; Yu, J.; Joulin, A.; Riedel, S.; Grave, E. Few-Shot Learning with Retrieval Augmented Language Models. arXiv 2022, arXiv:2208.03299. [Google Scholar]
- Tay, Y.; Wei, J.; Chung, H.; Tran, V.; So, D.; Shakeri, S.; Garcia, X.; Zheng, H.; Rao, J.; Chowdhery, A.; et al. Transcending Scaling Laws with 0.1% Extra Compute. arXiv 2022, arXiv:2210.11399. [Google Scholar]
- Chung, H.W.; Hou, L.; Longpre, S.; Zoph, B.; Tay, Y.; Fedus, W.; Li, E.; Wang, X.; Dehghani, M.; Brahma, S.; et al. Scaling Instruction-Finetuned Language Models. arXiv 2022, arXiv:2210.11416. [Google Scholar]
- Castricato, L.; Havrilla, A.; Matiana, S.; Pieler, M.; Ye, A.; Yang, I.; Frazier, S.; Riedl, M. Robust Preference Learning for Storytelling Via Contrastive Reinforcement Learning. arXiv 2022, arXiv:2210.07792. [Google Scholar]
- AI-Written Critiques Help Humans Notice Flaws. Available online: https://openai.com/blog/critiques/ (accessed on 19 October 2022).
- Tech Xplore: Researchers Develop a Method to Keep Bots from Using Toxic Language. Available online: https://techxplore.com/news/2022-04-method-bots-toxic-language.html (accessed on 19 October 2022).
- Ho, J.; Salimans, T.; Gritsenko, A.; Chan, W.; Norouzi, M.; Fleet, D. Video Diffusion Models. arXiv 2022, arXiv:2204.03458. [Google Scholar]
- Googleplay App: Tapcaption—Ai Captions. Available online: https://play.google.com/store/apps/details?id=com.tapcaption (accessed on 19 October 2022).
- Soltan, S.; Ananthakrishnan, S.; FitzGerald, J.; Gupta, R.; Hamza, W.; Khan, H.; Peris, C.; Rawls, S.; Rosenbaum, A.; Rumshisky, A.; et al. AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model. arXiv 2022, arXiv:2208.01448. [Google Scholar]
- Lotf, H.; Ramdani, M. Multi-Label Classification. In Proceedings of the 13th International Conference on Intelligent Systems: Theories and Applications, New York, NY, USA, 23–24 September 2020; pp. 1–6. [Google Scholar]
- Read, J.; Martino, L.; Olmos, P.; Luengo, D. Scalable Multi-Output Label Prediction: From Classifier Chains to Classifier Trellises. Pattern Recognit. 2015, 48, 2096–2109. [Google Scholar] [CrossRef] [Green Version]
- Shi, W.; Yu, D.; Yu, Q. A Gaussian Process-Bayesian Bernoulli Mixture Model for Multi-Label Active Learning. In Proceedings of the NeurIPS, Online, 6–14 December 2021. [Google Scholar]
- Yao, S.; Zhao, J.; Yu, D.; Du, N.; Shafran, I.; Narasimhan, K.; Cao, Y. ReAct: Synergizing Reasoning and Acting in Language Models. arXiv 2022, arXiv:2210.03629. [Google Scholar]
- Pathways Language Model (PaLM): Scaling to 540 Billion Parameters for Breakthrough Performance. Available online: https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html (accessed on 20 October 2022).
- Zelikman, E.; Wu, Y.; Goodman, N. STaR: Bootstrapping Reasoning with Reasoning. arXiv 2022, arXiv:2203.14465. [Google Scholar]
- Wei, J.; Wang, X.; Schuurmans, D.; Bosma, M.; Chi, E.; Le, Q.; Zhou, D. Chain of Thought Prompting Elicits Reasoning in Large Language Models. arXiv 2022, arXiv:2201.11903. [Google Scholar]
- Thoppilan, R.; De Freitas, D.; Hall, J.; Shazeer, N.; Kulshreshtha, A.; Cheng, H.; Jin, A.; Bos, T.; Baker, L.; Du, Y.; et al. LaMDA: Language Models for Dialog Applications. arXiv 2022, arXiv:2201.08239. [Google Scholar]
- Shi, F.; Suzgun, M.; Freitag, M.; Wang, X.; Srivats, S.; Vosoughi, S.; Chung, H.; Tay, Y.; Ruder, S.; Zhou, D.; et al. Language Models Are Multilingual Chain-of-Thought Reasoners. arXiv 2022, arXiv:2210.03057. [Google Scholar]
- Kojima, T.; Gu, S.; Reid, M.; Matsuo, Y.; Iwasawa, Y. Large Language Models Are Zero-Shot Reasoners. arXiv 2022, arXiv:2205.11916. [Google Scholar]
- Zhou, D.; Schärli, N.; Hou, L.; Wei, J.; Scales, N.; Wang, X.; Schuurmans, D.; Bousquet, O.; Le, Q.; Chi, E. Least-to-Most Prompting Enables Complex Reasoning in Large Language Models. arXiv 2022, arXiv:2205.10625. [Google Scholar]
- Aligning Language Models to Follow Instructions. Available online: https://openai.com/blog/instruction-following/ (accessed on 20 October 2022).
- New GPT3 Impressive Capabilities—Instructgpt3. Available online: https://www.lesswrong.com/posts/dypAjfRCe4nyasGSs/new-gpt3-impressive-capabilities-instructgpt3-1-2 (accessed on 22 October 2022).
- Learning to Summarize with Human Feedback. Available online: https://openai.com/blog/learning-to-summarize-with-human-feedback/ (accessed on 22 October 2022).
- BlenderBot 3: A 175B Parameter, Publicly Available Chatbot That Improves Its Skills and Safety over Time. Available online: https://ai.facebook.com/blog/blenderbot-3-a-175b-parameter-publicly-available-chatbot-that-improves-its-skills-and-safety-over-time (accessed on 22 October 2022).
- Scheurer, J.E.E.; Campos, J.; Chan, J.; Chen, A.; Cho, K.; Perez, E. Training Language Models with Language Feedback. arXiv 2022, arXiv:2204.14146. [Google Scholar]
- YouTube: Learning from Natural Language Feedback. Available online: https://www.youtube.com/watch?v=oEnyl9dMKCc (accessed on 22 October 2022).
- DeepMind: Robust Real-Time Cultural Transmission without Human Data Supplementary Material. Available online: https://sites.google.com/view/dm-cgi (accessed on 22 October 2022).
- Aghajanyan, A.; Huang, B.; Ross, C.; Karpukhin, V.; Xu, H.; Goyal, N.; Okhonko, D.; Joshi, M.; Ghosh, G.; Lewis, M.; et al. CM3: A Causal Masked Multimodal Model of the Internet. arXiv 2022, arXiv:2201.07520. [Google Scholar]
- Singh, A.; Hu, R.; Goswami, V.; Couairon, G.; Galuba, W.; Rohrbach, M.; Kiela, D. FLAVA: A Foundational Language and Vision Alignment Model. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 15617–15629. [Google Scholar]
- Almost No Data and No Time? Unlock the True Potential of GPT3! Available online: https://www.waylay.io/articles/nlp-case-study-by-waylay (accessed on 22 October 2022).
- DeepMind: Melting Pot. Available online: https://github.com/deepmind/meltingpot (accessed on 22 October 2022).
- Imitate and Repurpose: Learning Reusable Robot Movement Skills from Human and Animal Behaviors. Available online: https://sites.google.com/view/robot-npmp (accessed on 22 October 2022).
- Armstrong, S.; Mindermann, S. Occam’s Razor Is Insufficient to Infer the Preferences of Irrational Agents. arXiv 2017, arXiv:1712.05812. [Google Scholar]
- WebGPT: Improving the Factual Accuracy of Language Models through Web Browsing. Available online: https://openai.com/blog/webgpt/ (accessed on 22 October 2022).
- Rae, J.W.; Borgeaud, S.; Cai, T.; Millican, K.; Hoffmann, J.; Song, F.; Aslanides, J.; Henderson, S.; Ring, R.; Young, S.; et al. Scaling Language Models: Methods, Analysis & Insights from Training Gopher. arXiv 2021, arXiv:2112.11446. [Google Scholar]
- Language Modelling at Scale: Gopher, Ethical Considerations, and Retrieval. Available online: https://www.deepmind.com/blog/language-modelling-at-scale-gopher-ethical-considerations-and-retrieval (accessed on 22 October 2022).
- Contextual Rephrasing in Google Assistant. Available online: https://ai.googleblog.com/2022/05/contextual-rephrasing-in-google.html (accessed on 22 October 2022).
- Wu, Y.; Rabe, M.; Hutchins, D.; Szegedy, C. Memorizing Transformers. arXiv 2022, arXiv:2203.08913. [Google Scholar]
- Lehman, J.; Gordon, J.; Jain, S.; Ndousse, K.; Yeh, C.; Stanley, K. Evolution through Large Models. arXiv 2022, arXiv:2206.08896. [Google Scholar]
- Guo, Z.D.; Thakoor, S.; Pislar, M.; Pires, B.; Altché, F.; Tallec, C.; Saade, A.; Calandriello, D.; Grill, J.-B.; Tang, Y.; et al. BYOL-Explore: Exploration by Bootstrapped Prediction. arXiv 2022, arXiv:2206.08332. [Google Scholar]
- Sorscher, B.; Geirhos, R.; Shekhar, S.; Ganguli, S.; Morcos, A. Beyond Neural Scaling Laws: Beating Power Law Scaling Via Data Pruning. arXiv 2022, arXiv:2206.14486. [Google Scholar]
- Stability AI: Stable Diffusion Public Release. Available online: https://stability.ai/blog/stable-diffusion-public-release (accessed on 22 October 2022).
- Compressing Global Illumination with Neural Networks. Available online: https://juretriglav.si/compressing-global-illumination-with-neural-networks/ (accessed on 22 October 2022).
- Stable Diffusion Based Image Compression. Available online: https://pub.towardsai.net/stable-diffusion-based-image-compresssion-6f1f0a399202 (accessed on 22 October 2022).
- Nvidia Maxine. Available online: https://developer.nvidia.com/maxine (accessed on 22 October 2022).
- Anil, C.; Wu, Y.; Andreassen, A.; Lewkowycz, A.; Misra, V.; Ramasesh, V.; Slone, A.; Gur-Ari, G.; Dyer, E.; Neyshabur, B. Exploring Length Generalization in Large Language Models. arXiv 2022, arXiv:2207.04901. [Google Scholar]
- Dąbrowska, E. What Exactly Is Universal Grammar, and Has Anyone Seen It? Front. Psychol. 2015, 6, 852. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Transformer Language Models Are Doing Something More General. Available online: https://www.lesswrong.com/posts/YwqSijHybF9GFkDab/transformer-language-models-are-doing-something-more-general (accessed on 22 October 2022).
- Eight Ways You Can Get More Enjoyment from the Same Activity. Available online: https://www.spencergreenberg.com/2021/02/eight-ways-you-can-get-more-enjoyment-from-the-same-activity/ (accessed on 22 October 2022).
- Six A/B Tests Used by Duolingo to Tap into Habit-Forming Behaviour. Available online: https://econsultancy.com/six-a-b-tests-used-by-duolingo-to-tap-into-habit-forming-behaviour/ (accessed on 22 October 2022).
- The Snapchat Streak: Brilliant Marketing, Destructive Social Results. Available online: https://theboar.org/2019/11/the-snapchat-streak-brilliant-marketing-destructive-social-results/ (accessed on 22 October 2022).
- I Think It’s Time to Give up My Duolingo Streak. Available online: https://debugger.medium.com/i-think-its-time-to-give-up-my-duolingo-streak-81c27ff1be8b (accessed on 22 October 2022).
- Sabou, M.; Bontcheva, K.; Derczynski, L.; Scharl, A. Corpus Annotation through Crowdsourcing: Towards Best Practice Guidelines. In Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC’14), Reykjavik, Iceland, 26–31 May 2014. [Google Scholar]
- Wang, A.; Hoang, C.; Kan, M.-Y. Perspectives on Crowdsourcing Annotations for Natural Language Processing. Lang. Resour. Eval. 2012, 47, 9–31. [Google Scholar] [CrossRef] [Green Version]
- What Is 4chan? Available online: https://www.4chan.org/ (accessed on 22 October 2022).
- How Asynchronous Online in ‘Death Stranding’ Brings Players Together. Available online: https://goombastomp.com/asynchronous-death-stranding (accessed on 22 October 2022).
- Sucholutsky, I.; Schonlau, M. ‘Less Than One’-Shot Learning: Learning N Classes from M < N Samples. arXiv 2020, arXiv:2009.08449. [Google Scholar]
- Hudson, D.A.; Zitnick, C. Generative Adversarial Transformers. arXiv 2021, arXiv:2103.01209. [Google Scholar]
- Yoon, J.; Jordon, J.; Schaar, M. GAIN: Missing Data Imputation Using Generative Adversarial Nets. In Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; pp. 5689–5698. [Google Scholar]
- Jarrett, D.; Cebere, B.; Liu, T.; Curth, A.; Schaar, M. HyperImpute: Generalized Iterative Imputation with Automatic Model Selection. In Proceedings of the 39th International Conference on Machine Learning, Baltimore, MD, USA, 17–23 July 2022; pp. 9916–9937. [Google Scholar]
- Abroshan, M.; Yip, K.; Tekin, C.; Schaar, M. Conservative Policy Construction Using Variational Autoencoders for Logged Data with Missing Values. IEEE Trans. Neural Netw. Learn. Syst. 2022, 1–11. [Google Scholar] [CrossRef]
- Kyono, T.; Zhang, Y.; Bellot, A.; van der Schaar, M. MIRACLE: Causally-Aware Imputation Via Learning Missing Data Mechanisms. Adv. Neural Inf. Process. Syst. 2021, 34, 23806–23817. [Google Scholar]
- Yoon, J.; Zame, W.; Schaar, M. Estimating Missing Data in Temporal Data Streams Using Multi-Directional Recurrent Neural Networks. IEEE Trans. Biomed. Eng. 2019, 66, 1477–1490. [Google Scholar] [CrossRef] [Green Version]
- Cloud TPU: Accelerate Machine Learning Models with Google Supercomputers. Available online: https://cloud.google.com/tpu (accessed on 22 October 2022).
- Introducing the Colossus™ MK2 GC200 IPU. Available online: https://www.graphcore.ai/products/ipu (accessed on 22 October 2022).
- Basu, S.K. Chapter 9—A Cursory Look at Parallel Architectures and Biologically Inspired Computing. In Soft Computing and Intelligent Systems; Sinha, N., Gupta, M., Eds.; Academic Press: San Diego, CA, USA, 2000; pp. 185–216. [Google Scholar]
- Bagavathi, C.; Saraniya, O. Chapter 13—Evolutionary Mapping Techniques for Systolic Computing System. In Deep Learning and Parallel Computing Environment for Bioengineering Systems; Sangaiah, A., Ed.; Academic Press: San Diego, CA, USA, 2019; pp. 207–223. [Google Scholar]
- Narayanan, S.; Georgiou, P. Behavioral Signal Processing: Deriving Human Behavioral Informatics from Speech and Language: Computational Techniques Are Presented to Analyze and Model Expressed and Perceived Human Behavior-Variedly Characterized as Typical, Atypical, Distressed, and Disordered-from Speech and Language Cues and Their Applications in Health, Commerce, Education, and Beyond. Proc. IEEE Inst. Electr. Electron. Eng. 2013, 101, 1203–1233. [Google Scholar] [CrossRef] [Green Version]
- Hancock, B.; Bringmann, M.; Varma, P.; Liang, P.; Wang, S.; Re, C. Training Classifiers with Natural Language Explanations. Proc. Conf. Assoc. Comput. Linguist. Meet 2018, 2018, 1884–1895. [Google Scholar]
- Anderson, M.; Anderson, S. GenEth: A General Ethical Dilemma Analyzer. Paladyn J. Behav. Robot. 2018, 9, 337–357. [Google Scholar] [CrossRef]
- Gorwa, R.; Binns, R.; Katzenbach, C. Algorithmic Content Moderation: Technical and Political Challenges in the Automation of Platform Governance. Big Data Soc. 2020, 7, 2053951719897945. [Google Scholar] [CrossRef] [Green Version]
- Llanso, E. Artificial Intelligence, Content Moderation, and Freedom of Expression; Transatlantic Working Group on Content Moderation Online and Freedom of Expression, Institute for Information Law: Amsterdam, The Netherlands, 2020. [Google Scholar]
- Ofcom: Use of AI in Online Content Moderation. Available online: https://www.cambridgeconsultants.com/us/insights/whitepaper/ofcom-use-ai-online-content-moderation (accessed on 22 October 2022).
- Rovatsos, M.; Mittelstadt, B.; Koene, A. Landscape Summary: Bias in Algorithmic Decision-Making: What Is Bias in Algorithmic Decision-Making, How Can We Identify It, and How Can We Mitigate It? UK Government: London, UK, 2019. [Google Scholar]
- Palmer, A. Reasoning for the Digital Age; 2020. Available online: https://reasoningforthedigitalage.com/table-of-contents/contextual-relevance-straw-man-red-herring-and-moving-the-goalposts-fallacies/ (accessed on 22 October 2022).
- Talisse, R.; Aikin, S. Two Forms of the Straw Man. Argumentation 2006, 20, 345–352. [Google Scholar] [CrossRef]
- Jiang, L.; Hwang, J.; Bhagavatula, C.; Le Bras, R.; Forbes, M.; Borchardt, J.; Liang, J.; Etzioni, O.; Sap, M.; Choi, Y. Delphi: Towards Machine Ethics and Norms. arXiv 2021, arXiv:2110.07574. [Google Scholar]
- Incident 146: Research Prototype AI, Delphi, Reportedly Gave Racially Biased Answers on Ethics. Available online: https://incidentdatabase.ai/cite/146 (accessed on 22 October 2022).
- Jiang, L.; Hwang, J.; Bhagavatula, C.; Le Bras, R.; Liang, J.; Dodge, J.; Sakaguchi, K.; Forbes, M.; Borchardt, J.; Gabriel, S.; et al. Can Machines Learn Morality? The Delphi Experiment. arXiv 2021, arXiv:2110.07574. [Google Scholar]
- Ask Delphi. Available online: https://delphi.allenai.org/ (accessed on 22 October 2022).
- Redwood Research’s Current Project. Available online: https://www.alignmentforum.org/posts/k7oxdbNaGATZbtEg3/redwood-research-s-current-project (accessed on 20 October 2022).
- Herokuapp: Talk to Filtered Transformer. Available online: https://rr-data.herokuapp.com/talk-to-filtered-transformer (accessed on 22 October 2022).
- Granitzer, M.; Kroll, M.; Seifert, C.; Rath, A.; Weber, N.; Dietzel, O.; Lindstaedt, S. Analysis of Machine Learning Techniques for Context Extraction. In Proceedings of the 2008 Third International Conference on Digital Information Management, London, UK, 13–16 November 2008; pp. 233–240. [Google Scholar]
- Anjomshoae, S.; Omeiza, D.; Jiang, L. Context-Based Image Explanations for Deep Neural Networks. Image Vis. Comput. 2021, 116, 104310. [Google Scholar] [CrossRef]
- Zhao, Z.Q.; Zheng, P.; Xu, S.; Wu, X. Object Detection with Deep Learning: A Review. IEEE Trans. Neural. Netw. Learn Syst. 2019, 30, 3212–3232. [Google Scholar] [CrossRef] [Green Version]
- Grishman, R.; Sundheim, B. Message Understanding Conference-6: A Brief History. In Proceedings of the 16th Conference on Computational Linguistics, Copenhagen, Denmark, 5–9 August 1996. [Google Scholar]
- Nadeau, D.; Sekine, S. A Survey of Named Entity Recognition and Classification. Lingvisticae Investig. 2007, 30, 3–26. [Google Scholar] [CrossRef]
- Cunningham, H.; Tablan, V.; Roberts, A.; Bontcheva, K. Getting More out of Biomedical Documents with GATE’s Full Lifecycle Open Source Text Analytics. PLoS Comput. Biol. 2013, 9, e1002854. [Google Scholar] [CrossRef] [Green Version]
- Kwartler, T. The OpenNLP Project. In Text Mining in Practice with R; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2017; pp. 237–269. [Google Scholar]
- Mansouri, A.; Affendey, L.; Mamat, A. Named Entity Recognition Approaches. Int. J. Comput. Sci. Netw. Secur. 2008, 8, 339–344.
- Kołcz, A.; Chowdhury, A.; Alspector, J. Data Duplication: An Imbalance Problem? 2003. [Google Scholar]
- Haneem, F.; Ali, R.; Kama, N.; Basri, S. Resolving Data Duplication, Inaccuracy and Inconsistency Issues Using Master Data Management. In Proceedings of the 2017 5th International Conference on Research and Innovation in Information Systems (ICRIIS), 2017; pp. 1–6.
- Zhou, X.; Chen, L. Monitoring Near Duplicates over Video Streams. In Proceedings of the 18th ACM International Conference on Multimedia, Firenze, Italy, 25–29 October 2010; pp. 521–530. [Google Scholar]
- Ciro, J.; Galvez, D.; Schlippe, T.; Kanter, D. LSH Methods for Data Deduplication in a Wikipedia Artificial Dataset. arXiv 2021, arXiv:2112.11478. [Google Scholar]
- Fröbe, M.; Bevendorff, J.; Reimer, J.; Potthast, M.; Hagen, M. Sampling Bias Due to near-Duplicates in Learning to Rank. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Xi’an, China, 25–30 July 2020; pp. 1997–2000. [Google Scholar]
- Suzuki, I.; Hara, K.; Eizuka, Y. Impact of Duplicating Small Training Data on GANs. In Proceedings of the 10th International Conference on Data Science, Technology and Applications, Paris, France, 6–8 July 2021; pp. 308–315. [Google Scholar]
- Hoque, R.; Chen, L.; Sharma, S.; Dharmarajan, K.; Thananjeyan, B.; Abbeel, P.; Goldberg, K. Fleet-DAgger: Interactive Robot Fleet Learning with Scalable Human Supervision. arXiv 2022, arXiv:2206.14349. [Google Scholar]
- de Laat, P.B. The Use of Software Tools and Autonomous Bots against Vandalism: Eroding Wikipedia’s Moral Order? Ethics Inf. Technol. 2015, 17, 175–188. [Google Scholar] [CrossRef] [Green Version]
- This Machine Kills Trolls. Available online: https://www.theverge.com/2014/2/18/5412636/this-machine-kills-trolls-how-wikipedia-robots-snuff-out-vandalism (accessed on 22 October 2022).
- Teng, F.; Ma, M.; Ma, Z.; Huang, L.; Xiao, M.; Li, X. A Text Annotation Tool with Pre-Annotation Based on Deep Learning. In Knowledge Science, Engineering and Management; Springer: New York, NY, USA, 2019; pp. 440–451. [Google Scholar]
- Ringger, E.; Carmen, M.; Haertel, R.; Seppi, K.; Lonsdale, D.; McClanahan, P.; Carroll, J.; Ellison, N. Assessing the Costs of Machine-Assisted Corpus Annotation through a User Study. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC’08), Marrakech, Morocco, 28–30 May 2008. [Google Scholar]
- Lingren, T.; Deleger, L.; Molnar, K.; Zhai, H.; Meinzen-Derr, J.; Kaiser, M.; Stoutenborough, L.; Li, Q.; Solti, I. Evaluating the Impact of Pre-Annotation on Annotation Speed and Potential Bias: Natural Language Processing Gold Standard Development for Clinical Named Entity Recognition in Clinical Trial Announcements. J. Am. Med. Inform. Assoc. 2013, 21, 406–413. [Google Scholar] [CrossRef] [PubMed]
- Deep Hierarchical Planning from Pixels. Available online: https://ai.googleblog.com/2022/07/deep-hierarchical-planning-from-pixels.html (accessed on 22 October 2022).
- Assran, M.; Caron, M.; Misra, I.; Bojanowski, P.; Bordes, F.; Vincent, P.; Joulin, A.; Rabbat, M.; Ballas, N. Masked Siamese Networks for Label-Efficient Learning. arXiv 2022, arXiv:2204.07141. [Google Scholar]
- ML-Enhanced Code Completion Improves Developer Productivity. Available online: https://ai.googleblog.com/2022/07/ml-enhanced-code-completion-improves.html (accessed on 22 October 2022).
- YouTube: How to Use GPT-3 on Identifying an Answer Is Useful to a Given Question? Available online: https://www.youtube.com/watch?v=5Mwxm8A1tOo (accessed on 22 October 2022).
- How AI Could Help Make Wikipedia Entries More Accurate. Available online: https://tech.fb.com/artificial-intelligence/2022/07/how-ai-could-help-make-wikipedia-entries-more-accurate/ (accessed on 22 October 2022).
- Kadavath, S.; Conerly, T.; Askell, A.; Henighan, T.; Drain, D.; Perez, E.; Schiefer, N.; Dodds, Z.; DasSarma, N.; Tran-Johnson, E.; et al. Language Models (Mostly) Know What They Know. arXiv 2022, arXiv:2207.05221. [Google Scholar]
- Chen, B.; Kwiatkowski, R.; Vondrick, C.; Lipson, H. Full-Body Visual Self-Modeling of Robot Morphologies. Sci. Robot. 2022, 7, 68. [Google Scholar] [CrossRef]
- Lee, S.; Chung, J.; Yu, Y.; Kim, G.; Breuel, T.; Chechik, G.; Song, Y. ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; pp. 10254–10264. [Google Scholar]
- Pati, S.; Baid, U.; Zenk, M.; Edwards, B.; Sheller, M.; Reina, G.; Foley, P.; Gruzdev, A.; Martin, J.; Albarqouni, S.; et al. The Federated Tumor Segmentation (FeTS) Challenge. arXiv 2021, arXiv:2105.05874. [Google Scholar]
- Abeyruwan, S.; Graesser, L.; D’Ambrosio, D.; Singh, A.; Shankar, A.; Bewley, A.; Sanketi, P. i-Sim2Real: Reinforcement Learning of Robotic Policies in Tight Human-Robot Interaction Loops. arXiv 2022, arXiv:2207.06572. [Google Scholar]
- Xie, K.; Wang, T.; Iqbal, U.; Guo, Y.; Fidler, S.; Shkurti, F. Physics-Based Human Motion Estimation and Synthesis from Videos. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; pp. 11512–11521. [Google Scholar]
- Tzaban, R.; Mokady, R.; Gal, R.; Bermano, A.; Cohen-Or, D. Stitch It in Time: GAN-Based Facial Editing of Real Videos. arXiv 2022, arXiv:2201.08361. [Google Scholar]
- Fu, J.; Li, S.; Jiang, Y.; Lin, K.-Y.; Qian, C.; Loy, C.; Wu, W.; Liu, Z. StyleGAN-Human: A Data-Centric Odyssey of Human Generation. arXiv 2022, arXiv:2204.11823. [Google Scholar]
- How Waabi World Works. Available online: https://waabi.ai/how-waabi-world-works/ (accessed on 22 October 2022).
- Wei, J.; Bosma, M.; Zhao, V.; Guu, K.; Yu, A.; Lester, B.; Du, N.; Dai, A.; Le, Q. Finetuned Language Models Are Zero-Shot Learners. arXiv 2022, arXiv:2109.01652. [Google Scholar]
- Wang, W.; Dong, L.; Cheng, H.; Song, H.; Liu, X.; Yan, X.; Gao, J.; Wei, F. Visually-Augmented Language Modeling. arXiv 2022, arXiv:2205.10178. [Google Scholar]
- Brooks, T.; Hellsten, J.; Aittala, M.; Wang, T.-C.; Aila, T.; Lehtinen, J.; Liu, M.-Y.; Efros, A.; Karras, T. Generating Long Videos of Dynamic Scenes. arXiv 2022, arXiv:2206.03429. [Google Scholar]
- Nash, C.; Carreira, J.; Walker, J.; Barr, I.; Jaegle, A.; Malinowski, M.; Battaglia, P. Transframer: Arbitrary Frame Prediction with Generative Models. arXiv 2022, arXiv:2203.09494. [Google Scholar]
- Dall·E: Introducing Outpainting. Available online: https://openai.com/blog/dall-e-introducing-outpainting/ (accessed on 22 October 2022).
- Li, D.; Wang, S.; Zou, J.; Chang, T.; Nieuwburg, E.; Sun, F.; Kanoulas, E. Paint4Poem: A Dataset for Artistic Visualization of Classical Chinese Poems. arXiv 2021, arXiv:2109.11682. [Google Scholar]
- Anonymous. Phenaki: Variable Length, Video Generation from Open Domain Textual Descriptions. OpenReview 2022. Available online: https://openreview.net/pdf?id=vOEXS39nOF (accessed on 23 October 2022).
- Singer, U.; Polyak, A.; Hayes, T.; Yin, X.; An, J.; Zhang, S.; Hu, Q.; Yang, H.; Ashual, O.; Gafni, O.; et al. Make-A-Video: Text-to-Video Generation without Text-Video Data. arXiv 2022, arXiv:2209.14792. [Google Scholar]
- Explore Synthetic Futuring. Available online: https://medium.thirdwaveberlin.com/explore-synthetic-futuring-59819a12c4ee (accessed on 23 October 2022).
- Li, Y.; Panda, R.; Kim, Y.; Chen, C.-F.; Feris, R.; Cox, D.; Vasconcelos, N. VALHALLA: Visual Hallucination for Machine Translation. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 5206–5216. [Google Scholar]
- Rahtz, M.; Varma, V.; Kumar, R.; Kenton, Z.; Legg, S.; Leike, J. Safe Deep RL in 3D Environments Using Human Feedback. arXiv 2022, arXiv:2201.08102. [Google Scholar]
- Axenie, C.; Scherr, W.; Wieder, A.; Torres, A.; Meng, Z.; Du, X.; Sottovia, P.; Foroni, D.; Grossi, M.; Bortoli, S.; et al. Fuzzy Modeling and Inference for Physics-Aware Road Vehicle Driver Behavior Model Calibration. Expert Systems with Applications. 2022. [Google Scholar] [CrossRef]
- Baker, B.; Akkaya, I.; Zhokhov, P.; Huizinga, J.; Tang, J.; Ecoffet, A.; Houghton, B.; Sampedro, R.; Clune, J. Video Pretraining (VPT): Learning to Act by Watching Unlabeled Online Videos. arXiv 2022, arXiv:2206.11795. [Google Scholar]
- Learning to Play Minecraft with Video Pretraining (Vpt). Available online: https://openai.com/blog/vpt/ (accessed on 23 October 2022).
- Su, H.; Kasai, J.; Wu, C.; Shi, W.; Wang, T.; Xin, J.; Zhang, R.; Ostendorf, M.; Zettlemoyer, L.; Smith, N.; et al. Selective Annotation Makes Language Models Better Few-Shot Learners. arXiv 2022, arXiv:2209.01975. [Google Scholar]
- Alaa, A.M.; Breugel, B.; Saveliev, E.; Schaar, M. How Faithful Is Your Synthetic Data? Sample-Level Metrics for Evaluating and Auditing Generative Models. arXiv 2022, arXiv:2102.08921. [Google Scholar]
- Wood, E.; Baltruvsaitis, T.; Hewitt, C.; Dziadzio, S.; Johnson, M.; Estellers, V.; Cashman, T.; Shotton, J. Fake It Till You Make It: Face Analysis in the Wild Using Synthetic Data Alone. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Nashville, TN, USA, 20–25 June 2021; pp. 3661–3671. [Google Scholar]
- Greff, K.; Belletti, F.; Beyer, L.; Doersch, C.; Du, Y.; Duckworth, D.; Fleet, D.; Gnanapragasam, D.; Golemo, F.; Herrmann, C.; et al. Kubric: A Scalable Dataset Generator. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 3739–3751. [Google Scholar]
- Jakesch, M.; Hancock, J.; Naaman, M. Human Heuristics for Ai-Generated Language Are Flawed. arXiv 2022, arXiv:2206.07271. [Google Scholar]
- Hao, Z.; Mallya, A.; Belongie, S.; Liu, M.-Y. GANcraft: Unsupervised 3D Neural Rendering of Minecraft Worlds. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Nashville, TN, USA, 20–25 June 2021; pp. 14052–14062. [Google Scholar]
- Khalid, N.M.; Xie, T.; Belilovsky, E.; Popa, T. Clip-Mesh: Generating Textured Meshes from Text Using Pretrained Image-Text Models; SIGGRAPH Asia. 2022. Available online: https://dl.acm.org/doi/abs/10.1145/3550469.3555392 (accessed on 22 October 2022).
- Sanghi, A.; Chu, H.; Lambourne, J.; Wang, Y.; Cheng, C.-Y.; Fumero, M. Clip-Forge: Towards Zero-Shot Text-to-Shape Generation. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 18582–18592. [Google Scholar]
- Poole, B.; Jain, A.; Barron, J.; Mildenhall, B. Dreamfusion: Text-to-3D Using 2D Diffusion. arXiv 2022, arXiv:2209.14988. [Google Scholar]
- Gao, J.; Shen, T.; Wang, Z.; Chen, W.; Yin, K.; Li, D.; Litany, O.; Gojcic, Z.; Fidler, S. GET3D: A Generative Model of High Quality 3D Textured Shapes Learned from Images. arXiv 2022, arXiv:2209.11163. [Google Scholar]
- Common Sense Machines, Generating 3D Worlds with CommonSim-1. Available online: https://csm.ai/commonsim-1-generating-3d-worlds/ (accessed on 23 October 2022).
- Cao, J.; Zhao, A.; Zhang, Z. Automatic Image Annotation Method Based on a Convolutional Neural Network with Threshold Optimization. PLoS ONE 2020, 15, e0238956. [Google Scholar] [CrossRef] [PubMed]
- Ranjbar, S.; Singleton, K.; Jackson, P.; Rickertsen, C.; Whitmire, S.; Clark-Swanson, K.; Mitchell, J.; Swanson, K.; Hu, L. A Deep Convolutional Neural Network for Annotation of Magnetic Resonance Imaging Sequence Type. J. Digit. Imaging 2020, 33, 439–446. [Google Scholar] [CrossRef]
- Wang, R.; Xie, Y.; Yang, J.; Xue, L.; Hu, M.; Zhang, Q. Large Scale Automatic Image Annotation Based on Convolutional Neural Network. J. Vis. Commun. Image Represent. 2017, 49, 213–224. [Google Scholar] [CrossRef]
- Chen, Y.; Liu, L.; Tao, J.; Chen, X.; Xia, R.; Zhang, Q.; Xiong, J.; Yang, K.; Xie, J. The Image Annotation Algorithm Using Convolutional Features from Intermediate Layer of Deep Learning. Multim. Tools Appl. 2021, 80, 4237–4261. [Google Scholar] [CrossRef]
- The Illustrated Transformer. Available online: https://jalammar.github.io/illustrated-transformer/ (accessed on 23 October 2022).
- Transformers from Scratch. Available online: https://e2eml.school/transformers.html (accessed on 23 October 2022).
- Transformers for Software Engineers. Available online: https://blog.nelhage.com/post/transformers-for-software-engineers/ (accessed on 23 October 2022).
- AIM. Big Tech & Their Favourite Deep Learning Techniques. In Analytics India Magazine; Analytics India Magazine: Bangalore, Karnataka, 2021. [Google Scholar]
- Phuong, M.; Hutter, M. Formal Algorithms for Transformers. arXiv 2022, arXiv:2207.09238. [Google Scholar]
- Reif, E.; Ippolito, D.; Yuan, A.; Coenen, A.; Callison-Burch, C.; Wei, J. A Recipe for Arbitrary Text Style Transfer with Large Language Models. arXiv 2021, arXiv:2109.03910. [Google Scholar]
- Jang, E. Just Ask for Generalization; 2021. Available online: https://evjang.com/2021/10/23/generalization.html/ (accessed on 23 October 2022).
- Prompt Engineering. Available online: https://docs.cohere.ai/prompt-engineering-wiki/ (accessed on 23 October 2022).
- Will Transformers Take over Artificial Intelligence? Available online: https://www.quantamagazine.org/will-transformers-take-over-artificial-intelligence-20220310 (accessed on 23 October 2022).
- Srivastava, A.; Rastogi, A.; Rao, A.; Shoeb, A.; Abid, A.; Fisch, A.; Brown, A.; Santoro, A.; Gupta, A.; Garriga-Alonso, A.; et al. Beyond the Imitation Game: Quantifying and Extrapolating the Capabilities of Language Models. arXiv 2022, arXiv:2206.04615. [Google Scholar]
- Branwen, G. The Scaling Hypothesis. 2021. Available online: https://www.gwern.net/Scaling-hypothesis/ (accessed on 23 October 2022).
- Alabdulmohsin, I.M.; Neyshabur, B.; Zhai, X. Revisiting Neural Scaling Laws in Language and Vision. arXiv 2022, arXiv:2209.06640. [Google Scholar]
- Austin, J.; Odena, A.; Nye, M.; Bosma, M.; Michalewski, H.; Dohan, D.; Jiang, E.; Cai, C.; Terry, M.; Le, Q.; et al. Program Synthesis with Large Language Models. arXiv 2021, arXiv:2108.07732. [Google Scholar]
- Lewkowycz, A.; Andreassen, A.; Dohan, D.; Dyer, E.; Michalewski, H.; Ramasesh, V.; Slone, A.; Anil, C.; Schlag, I.; Gutman-Solo, T.; et al. Solving Quantitative Reasoning Problems with Language Models. arXiv 2022, arXiv:2206.14858. [Google Scholar]
- Creswell, A.; Shanahan, M. Faithful Reasoning Using Large Language Models. arXiv 2022, arXiv:2208.14271. [Google Scholar]
- Drori, I.; Zhang, S.; Shuttleworth, R.; Tang, L.; Lu, A.; Ke, E.; Liu, K.; Chen, L.; Tran, S.; Cheng, N.; et al. A Neural Network Solves, Explains, and Generates University Math Problems by Program Synthesis and Few-Shot Learning at Human Level. Proc. Natl. Acad. Sci. USA 2022, 119, 32. [Google Scholar] [CrossRef] [PubMed]
- Triantafillou, E.; Zhu, T.; Dumoulin, V.; Lamblin, P.; Xu, K.; Goroshin, R.; Gelada, C.; Swersky, K.; Manzagol, P.-A.; Larochelle, H. Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples. arXiv 2020, arXiv:1903.03096. [Google Scholar]
- Ramesh, A.; Pavlov, M.; Goh, G.; Gray, S.; Voss, C.; Radford, A.; Chen, M.; Sutskever, I. Zero-Shot Text-to-Image Generation. arXiv 2021, arXiv:2102.12092. [Google Scholar]
- Zhang, P.; Dou, H.; Zhang, W.; Zhao, Y.; Li, S.; Qin, Z.; Li, X. Versatilegait: A Large-Scale Synthetic Gait Dataset Towards in-the-Wild Simulation. arXiv 2022, arXiv:2105.14421. [Google Scholar]
- Solaiman, I.; Dennison, C. Process for Adapting Language Models to Society (PALMS) with Values-Targeted Datasets. arXiv 2021, arXiv:2106.10328. [Google Scholar]
- Rombach, R.; Blattmann, A.; Lorenz, D.; Esser, P.; Ommer, B. High-Resolution Image Synthesis with Latent Diffusion Models. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 10674–10685. [Google Scholar]
- Yang, L.; Zhang, Z.; Hong, S.; Xu, R.; Zhao, Y.; Shao, Y.; Zhang, W.; Yang, M.-H.; Cui, B. Diffusion Models: A Comprehensive Survey of Methods and Applications. arXiv 2022, arXiv:2209.00796. [Google Scholar]
- Karras, T.; Laine, S.; Aila, T. A Style-Based Generator Architecture for Generative Adversarial Networks. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 4396–4405. [Google Scholar]
- Luo, C. Understanding Diffusion Models: A Unified Perspective. arXiv 2022, arXiv:2208.11970. [Google Scholar]
- Weng, L. What Are Diffusion Models? Available online: https://lilianweng.github.io/posts/2021-07-11-diffusion-models/ (accessed on 23 October 2022).
- Generative Modeling by Estimating Gradients of the Data Distribution. Available online: https://yang-song.net/blog/2021/score/ (accessed on 23 October 2022).
- Sohl-Dickstein, J.N.; Weiss, E.; Maheswaranathan, N.; Ganguli, S. Deep Unsupervised Learning Using Nonequilibrium Thermodynamics. arXiv 2015, arXiv:1503.03585. [Google Scholar]
- Liu, N.; Li, S.; Du, Y.; Torralba, A.; Tenenbaum, J. Compositional Visual Generation with Composable Diffusion Models. arXiv 2022, arXiv:2206.01714. [Google Scholar]
- Search Engine: You. Available online: https://you.com/ (accessed on 23 October 2022).
- Reed, S.; Zolna, K.; Parisotto, E.; Colmenarejo, S.; Novikov, A.; Barth-Maron, G.; Gimenez, M.; Sulsky, Y.; Kay, J.; Springenberg, J.; et al. A Generalist Agent. arXiv 2022, arXiv:2205.06175. [Google Scholar]
- Gato as the Dawn of Early Agi. Available online: https://www.lesswrong.com/posts/TwfWTLhQZgy2oFwK3/gato-as-the-dawn-of-early-agi (accessed on 23 October 2022).
- Why I Think Strong General Ai Is Coming Soon. Available online: https://www.lesswrong.com/posts/K4urTDkBbtNuLivJx/why-i-think-strong-general-ai-is-coming-soon (accessed on 23 October 2022).
- Huang, J.; Gu, S.; Hou, L.; Wu, Y.; Wang, X.; Yu, H.; Han, J. Large Language Models Can Self-Improve. arXiv 2022, arXiv:2210.11610. [Google Scholar]
- Sheng, A.; Padmanabhan, S. Self-Programming Artificial Intelligence Using Code-Generating Language Models; OpenReview 2022. Available online: https://openreview.net/forum?id=SKat5ZX5RET (accessed on 23 October 2022).
- Laskin, M.; Wang, L.; Oh, J.; Parisotto, E.; Spencer, S.; Steigerwald, R.; Strouse, D.; Hansen, S.; Filos, A.; Brooks, E.; et al. In-Context Reinforcement Learning with Algorithm Distillation. arXiv 2022, arXiv:2210.14215. [Google Scholar]
- Fawzi, A.; Balog, M.; Huang, A.; Hubert, T.; Romera-Paredes, B.; Barekatain, M.; Novikov, A.; Ruiz, F.R.; Schrittwieser, J.; Swirszcz, G.; et al. Discovering Faster Matrix Multiplication Algorithms with Reinforcement Learning. Nature 2022, 610, 47–53. [Google Scholar] [CrossRef]
- Strassen, V. Gaussian Elimination Is Not Optimal. Numer. Math. 1969, 13, 354–356. [Google Scholar] [CrossRef]
- Kauers, M.; Moosbauer, J. The Fbhhrbnrssshk-Algorithm for Multiplication in Z5 × 52 Is Still Not the End of the Story. arXiv 2022, arXiv:2210.04045. [Google Scholar]
- The Bitter Lesson. Available online: http://www.incompleteideas.net/IncIdeas/BitterLesson.html (accessed on 23 October 2022).
- Lee, K.-H.; Nachum, O.; Yang, M.; Lee, L.; Freeman, D.; Xu, W.; Guadarrama, S.; Fischer, I.; Jang, E.; Michalewski, H.; et al. Multi-Game Decision Transformers. arXiv 2022, arXiv:2205.15241. [Google Scholar]
- Stephen Wolfram Writings: Games and Puzzles as Multicomputational Systems. Available online: https://writings.stephenwolfram.com/2022/06/games-and-puzzles-as-multicomputational-systems/ (accessed on 23 October 2022).
- Cui, Z.J.; Wang, Y.; Shafiullah, N.; Pinto, L. From Play to Policy: Conditional Behavior Generation from Uncurated Robot Data. arXiv 2022, arXiv:2210.10047. [Google Scholar]
- Du, N.; Huang, Y.; Dai, A.; Tong, S.; Lepikhin, D.; Xu, Y.; Krikun, M.; Zhou, Y.; Yu, A.; Firat, O.; et al. Glam: Efficient Scaling of Language Models with Mixture-of-Experts. arXiv 2022, arXiv:2112.06905. [Google Scholar]
- Zhang, S.; Roller, S.; Goyal, N.; Artetxe, M.; Chen, M.; Chen, S.; Dewan, C.; Diab, M.; Li, X.; Lin, X.; et al. Opt: Open Pre-Trained Transformer Language Models. arXiv 2022, arXiv:2205.01068. [Google Scholar]
- Facebook Research: Chronicles of OPT Development. Available online: https://github.com/facebookresearch/metaseq/tree/main/projects/OPT/chronicles (accessed on 23 October 2022).
- How Much of Ai Progress Is from Scaling Compute? And How Far Will It Scale? Available online: https://www.metaculus.com/notebooks/10688/how-much-of-ai-progress-is-from-scaling-compute-and-how-far-will-it-scale/ (accessed on 23 October 2022).
- Micikevicius, P.; Stosic, D.; Burgess, N.; Cornea, M.; Dubey, P.; Grisenthwaite, R.; Ha, S.; Heinecke, A.; Judd, P.; Kamalu, J.; et al. Fp8 Formats for Deep Learning. arXiv 2022, arXiv:2209.05433. [Google Scholar]
- The First Posit-Based Processor Core Gave a Ten-Thousandfold Accuracy Boost. Available online: https://spectrum.ieee.org/floating-point-numbers-posits-processor (accessed on 23 October 2022).
- Mosaic Llms (Part 2): GPT-3 Quality for <$500 k. Available online: https://www.mosaicml.com/blog/gpt-3-quality-for-500k (accessed on 23 October 2022).
- Yang, G.; Hu, E.; Babuschkin, I.; Sidor, S.; Liu, X.; Farhi, D.; Ryder, N.; Pachocki, J.; Chen, W.; Gao, J. Tensor Programs V: Tuning Large Neural Networks Via Zero-Shot Hyperparameter Transfer. arXiv 2022, arXiv:2203.03466. [Google Scholar]
- Nagarajan, A.; Sen, S.; Stevens, J.; Raghunathan, A. Axformer: Accuracy-Driven Approximation of Transformers for Faster, Smaller and More Accurate Nlp Models. In Proceedings of the 2022 International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil, 18–23 July 2022; pp. 1–8. [Google Scholar]
- Stelzer, F.; Röhm, A.; Vicente, R.; Fischer, I.; Yanchuk, S. Deep Neural Networks Using a Single Neuron: Folded-in-Time Architecture Using Feedback-Modulated Delay Loops. Nat. Commun. 2021, 12, 5164. [Google Scholar] [CrossRef] [PubMed]
- Kirstain, Y.; Lewis, P.; Riedel, S.; Levy, O. A Few More Examples May Be Worth Billions of Parameters. arXiv 2021, arXiv:2110.04374. [Google Scholar]
- Schick, T.; Schütze, H. Exploiting Cloze-Questions for Few-Shot Text Classification and Natural Language Inference. arXiv 2021, arXiv:2001.07676. [Google Scholar]
- Hoffmann, J.; Borgeaud, S.; Mensch, A.; Buchatskaya, E.; Cai, T.; Rutherford, E.; Casas, D.; Hendricks, L.; Welbl, J.; Clark, A.; et al. Training Compute-Optimal Large Language Models. arXiv 2022, arXiv:2203.15556. [Google Scholar]
- Wang, W.; Bao, H.; Dong, L.; Bjorck, J.; Peng, Z.; Liu, Q.; Aggarwal, K.; Mohammed, O.; Singhal, S.; Som, S.; et al. Image as a Foreign Language: Beit Pretraining for All Vision and Vision-Language Tasks. arXiv 2022, arXiv:2208.10442. [Google Scholar]
- New Scaling Laws for Large Language Models. Available online: https://www.lesswrong.com/posts/midXmMb2Xg37F2Kgn/new-scaling-laws-for-large-language-models (accessed on 23 October 2022).
- Trees Are Harlequins, Words Are Harlequins. Available online: https://nostalgebraist.tumblr.com/post/680262678831415296/an-exciting-new-paper-on-neural-language-model (accessed on 23 October 2022).
- Understanding Scaling Laws for Recommendation Models. Available online: https://threadreaderapp.com/thread/1563455844670246912.html (accessed on 23 October 2022).
- Jurassic-X: Crossing the Neuro-Symbolic Chasm with the Mrkl System. Available online: https://www.ai21.com/blog/jurassic-x-crossing-the-neuro-symbolic-chasm-with-the-mrkl-system (accessed on 23 October 2022).
- Introducing Adept. Available online: https://www.adept.ai/post/introducing-adept (accessed on 23 October 2022).
- Hugging Face: Transformers. Available online: https://github.com/huggingface/transformers (accessed on 23 October 2022).
- Democratizing Access to Large-Scale Language Models with Opt-175b. Available online: https://ai.facebook.com/blog/democratizing-access-to-large-scale-language-models-with-opt-175b/ (accessed on 23 October 2022).
- Why Tool Ais Want to Be Agent Ais. Available online: https://www.gwern.net/Tool-AI (accessed on 23 October 2022).
- Wunderwuzzi’s Blog: GPT-3 and Phishing Attacks. Available online: https://embracethered.com/blog/posts/2022/gpt-3-ai-and-phishing-attacks/ (accessed on 23 October 2022).
- Wu, Y.; Jiang, A.; Li, W.; Rabe, M.; Staats, C.; Jamnik, M.; Szegedy, C. Autoformalization with Large Language Models. arXiv 2022, arXiv:2205.12615. [Google Scholar]
- Fei, N.; Lu, Z.; Gao, Y.; Yang, G.; Huo, Y.; Wen, J.; Lu, H.; Song, R.; Gao, X.; Xiang, T.; et al. Towards Artificial General Intelligence Via a Multimodal Foundation Model. Nat. Commun. 2022, 13, 3094. [Google Scholar] [CrossRef] [PubMed]
- Caccia, M.; Mueller, J.; Kim, T.; Charlin, L.; Fakoor, R. Task-Agnostic Continual Reinforcement Learning: In Praise of a Simple Baseline. arXiv 2022, arXiv:2205.14495. [Google Scholar]
- Fan, L.; Wang, G.; Jiang, Y.; Mandlekar, A.; Yang, Y.; Zhu, H.; Tang, A.; Huang, D.-A.; Zhu, Y.; Anandkumar, A. Minedojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge. arXiv 2022, arXiv:2206.08853. [Google Scholar]
- My Bet: Ai Size Solves Flubs. Available online: https://astralcodexten.substack.com/p/my-bet-ai-size-solves-flubs (accessed on 23 October 2022).
- What Does It Mean When an Ai Fails? A Reply to Slatestarcodex’s Riff on Gary Marcus. Available online: https://garymarcus.substack.com/p/what-does-it-mean-when-an-ai-fails (accessed on 23 October 2022).
- Somewhat Contra Marcus on Ai Scaling. Available online: https://astralcodexten.substack.com/p/somewhat-contra-marcus-on-ai-scaling (accessed on 23 October 2022).
- Fitzgerald, M.; Boddy, A.; Baum, S. 2020 Survey of Artificial General Intelligence Projects for Ethics, Risk, and Policy. Global Catastrophic Risk Institute Technical Report 20-1. 2020. Available online: https://gcrinstitute.org/papers/055_agi-2020.pdf (accessed on 23 October 2022).
- Metaculus: Date Weakly General Ai Is Publicly Known. Available online: https://www.metaculus.com/questions/3479/date-weakly-general-ai-is-publicly-known/ (accessed on 23 October 2022).
- Superglue Leaderboard Version: 2.0. Available online: https://super.gluebenchmark.com/leaderboard/ (accessed on 23 October 2022).
- Roy, R.; Raiman, J.; Kant, N.; Elkin, I.; Kirby, R.; Siu, M.; Oberman, S.; Godil, S.; Catanzaro, B. Prefixrl: Optimization of Parallel Prefix Circuits Using Deep Reinforcement Learning. In Proceedings of the 2021 58th ACM/IEEE Design Automation Conference (DAC), San Francisco, CA, USA, 5–9 December 2021; pp. 853–858. [Google Scholar]
- Kelly, B.T.; Malamud, S.; Zhou, K. The Virtue of Complexity in Return Prediction. Natl. Bur. Econ. Res. Work. Pap. Ser. 2022, 30217, 21–90. [Google Scholar] [CrossRef]
- Are You Really in a Race? The Cautionary Tales of Szilárd and Ellsberg. Available online: https://forum.effectivealtruism.org/posts/cXBznkfoPJAjacFoT/are-you-really-in-a-race-the-cautionary-tales-of-szilard-and (accessed on 23 October 2022).
- The Time Is Now to Develop Community Norms for the Release of Foundation Models. Available online: https://hai.stanford.edu/news/time-now-develop-community-norms-release-foundation-models (accessed on 23 October 2022).
- Agi Ruin: A List of Lethalities. Available online: https://www.lesswrong.com/posts/uMQ3cqWDPHhjtiesc/agi-ruin-a-list-of-lethalities (accessed on 23 October 2022).
- Lewis, M.; Yarats, D.; Dauphin, Y.; Parikh, D.; Batra, D. Deal or No Deal? End-to-End Learning of Negotiation Dialogues. arXiv 2017, arXiv:1706.05125. [Google Scholar]
- Ought, Inc. Interactive Composition Explorer. Available online: https://github.com/oughtinc/ice (accessed on 23 October 2022).
- Shu, T.; Bhandwaldar, A.; Gan, C.; Smith, K.; Liu, S.; Gutfreund, D.; Spelke, E.; Tenenbaum, J.; Ullman, T. Agent: A Benchmark for Core Psychological Reasoning. arXiv 2021, arXiv:2102.12321. [Google Scholar]
- Aligned AI: The Happy Faces Benchmark. Available online: https://github.com/alignedai/HappyFaces (accessed on 23 October 2022).
- Kenton, Z.; Everitt, T.; Weidinger, L.; Gabriel, I.; Mikulik, V.; Irving, G. Alignment of Language Agents. arXiv 2021, arXiv:2103.14659. [Google Scholar]
- Weidinger, L.; Mellor, J.; Rauh, M.; Griffin, C.; Uesato, J.; Huang, P.-S.; Cheng, M.; Glaese, M.; Balle, B.; Kasirzadeh, A.; et al. Ethical and Social Risks of Harm from Language Models. arXiv 2021, arXiv:2112.04359. [Google Scholar]
- Glaese, A.; McAleese, N.; Trkebacz, M.; Aslanides, J.; Firoiu, V.; Ewalds, T.; Rauh, M.; Weidinger, L.; Chadwick, M.; Thacker, P.; et al. Improving Alignment of Dialogue Agents Via Targeted Human Judgements. arXiv 2022, arXiv:2209.14375. [Google Scholar]
- Xie, C.; Cai, H.; Song, J.; Li, J.; Kong, F.; Wu, X.; Morimitsu, H.; Yao, L.; Wang, D.; Leng, D.; et al. Zero and R2D2: A Large-Scale Chinese Cross-Modal Benchmark and a Vision-Language Framework; 2022.; arXiv 2022, arXiv: 2205. 0386. [Google Scholar]
- Nvidia Omniverse Replicator Generates Synthetic Training Data for Robots. Available online: https://developer.nvidia.com/blog/generating-synthetic-datasets-isaac-sim-data-replicator/ (accessed on 23 October 2022).
- Starke, S.; Zhang, H.; Komura, T.; Saito, J. Neural State Machine for Character-Scene Interactions. ACM Trans. Graph. 2019, 38, 1–14. [Google Scholar] [CrossRef] [Green Version]
- Liu, R.; Wei, J.; Gu, S.S.; Wu, T.-Y.; Vosoughi, S.; Cui, C.; Zhou, D.; Dai, A. Mind’s Eye: Grounded Language Model Reasoning through Simulation. arXiv 2022, arXiv:2210.05359. [Google Scholar]
- Mitrano, P.; Berenson, D. Data Augmentation for Manipulation. arXiv 2022, arXiv:2205.02886. [Google Scholar]
- Karpas, E.D.; Abend, O.; Belinkov, Y.; Lenz, B.; Lieber, O.; Ratner, N.; Shoham, Y.; Bata, H.; Levine, Y.; Leyton-Brown, K.; et al. Mrkl Systems: A Modular, Neuro-Symbolic Architecture That Combines Large Language Models, External Knowledge Sources and Discrete Reasoning. arXiv 2022, arXiv:2205.00445. [Google Scholar]
- Ling, H.; Kreis, K.; Li, D.; Kim, S.; Torralba, A.; Fidler, S. Editgan: High-Precision Semantic Image Editing. arXiv 2021, arXiv:2111.03186. [Google Scholar]
- Fedus, W.; Dean, J.; Zoph, B. A Review of Sparse Expert Models in Deep Learning. arXiv 2022, arXiv:2209.01667. [Google Scholar]
- Rajbhandari, S.; Li, C.; Yao, Z.; Zhang, M.; Aminabadi, R.; Awan, A.; Rasley, J.; He, Y. Deepspeed-Moe: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation Ai Scale. In Proceedings of the 39th International Conference on Machine Learning, Baltimore, MD, USA, 17–23 July 2022; pp. 18332–18346. [Google Scholar]
- Kittur, A.; Yu, L.; Hope, T.; Chan, J.; Lifshitz-Assaf, H.; Gilon, K.; Ng, F.; Kraut, R.; Shahaf, D. Scaling up Analogical Innovation with Crowds and Ai. Proc. Natl. Acad. Sci. USA 2019, 116, 16654. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Wang, C.-Y.; Yeh, I.-H.; Liao, H. You Only Learn One Representation: Unified Network for Multiple Tasks. arXiv 2021, arXiv:2105.04206. [Google Scholar]
- Meng, K.; Bau, D.; Andonian, A.; Belinkov, Y. Locating and Editing Factual Associations in GPT; 2022. [Google Scholar]
- Meet Loab, the Ai Art Woman Haunting the Internet. Available online: https://www.cnet.com/science/what-is-loab-the-haunting-ai-art-woman-explained/ (accessed on 23 October 2022).
- Weng, L. Learning with Not Enough Data Part 1: Semi-Supervised Learning. Available online: https://lilianweng.github.io/posts/2021-12-05-semi-supervised/ (accessed on 23 October 2022).
- Davis, K.M.; Torre-Ortiz, C.; Ruotsalo, T. Brain-Supervised Image Editing. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022. [Google Scholar]
- Machado, K.; Kank, R.; Sonawane, J.; Maitra, S. A Comparative Study of Acid and Base in Database Transaction Processing. Int. J. Sci. Eng. Res. 2017, 8, 116–119. [Google Scholar]
- Lesswrong: Comment by User Gwern. Available online: https://www.lesswrong.com/posts/uKp6tBFStnsvrot5t/what-dall-e-2-can-and-cannot-do?commentId=CWKFyJYfgoZfP9955 (accessed on 23 October 2022).
- Bostrom, N. Base Camp for Mt. Ethics DRAFT version 0.9 2022. Available online: https://nickbostrom.com/papers/mountethics.pdf (accessed on 23 October 2022).
- Wang, Z.; Yu, A.; Firat, O.; Cao, Y. Towards Zero-Label Language Learning. arXiv 2021, arXiv:2109.09193. [Google Scholar]
- Ge, X.; Zhang, K.; Gribizis, A.; Hamodi, A.; Sabino, A.; Crair, M. Retinal Waves Prime Visual Motion Detection by Simulating Future Optic Flow. Science 2021, 373, 6553. [Google Scholar] [CrossRef] [PubMed]
- Import Ai 269: Baidu Takes on Meena; Microsoft Improves Facial Recognition with Synthetic Data; Unsolved Problems in Ai Safety. Available online: https://jack-clark.net/2021/10/11/import-ai-269-baidu-takes-on-meena-microsoft-improves-facial-recognition-with-synthetic-data-unsolved-problems-in-ai-safety/ (accessed on 23 October 2022).
- Touvron, H.; Cord, M.; Jegou, H. Deit Iii: Revenge of the Vit. arXiv 2021, arXiv:2204.07118. [Google Scholar]

| ID | Research question |
| --- | --- |
| RQ1 | What methodologies and frameworks can facilitate annotation, especially those with a multimodal nature? |
| RQ2 | How to encode data in formats which facilitate safe and ethical interchange, as well as the coding of expansive and representative modalities/categorizations? |
| RQ3 | How to streamline the user experience to reduce cognitive load and training requirements? |
| RQ4 | How to augment user contributions to increase their impact? |
| RQ5 | How to validate coded information as being reasonable and appropriate? |
| RQ6 | How to pre-process data or to permit pre-annotation? |
| RQ7 | How can Transformer-type technologies be applied to annotation? |

| ID | Database | URL |
| --- | --- | --- |
| RD1 | Scopus | www.scopus.com |
| RD2 | IEEE Xplore | ieeexplore.ieee.org |
| RD3 | Science Direct | www.sciencedirect.com |
| RD4 | Elicit | www.elicit.org |
| RD5 | WorldCat | www.worldcat.org |
| RD6 | Google Scholar | scholar.google.com |
| RD7 | ArXiv | www.arxiv.org |

| Inclusion criterion |
| --- |
| The study employed tools for annotating behavior that embodied the following keywords: (a) annotation and (b) behaviors. |
| The study included either (c) a collaborative analysis mechanism or (d) an element of automation, both elements providing a means of amplification. |
| The study reported the research methods applied (i.e., the type of data being generated, the technologies employed, the intended use case, the general research design). |
| The research presented in the study did not overlap with research from another study. Where overlap occurred, a note was taken of the original research, but reporting focused on the most recent results. |
| The study was written during or after the year 2000. |
| The article was written in English, or a professional translation was readily available. |
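
The inclusion criteria above can be read as a single screening predicate applied to each candidate paper. The following is a minimal sketch under stated assumptions, not the authors' actual screening tooling: the `Study` record, its field names, and the `meets_inclusion_criteria` function are all hypothetical, and in practice each field would be filled in by a human reviewer.

```python
from dataclasses import dataclass


@dataclass
class Study:
    """Hypothetical record for one candidate paper (field names are illustrative)."""
    keywords: set[str]               # keywords the paper's tools embody
    has_collaboration: bool          # (c) collaborative analysis mechanism
    has_automation: bool             # (d) element of automation
    reports_methods: bool            # research methods are reported
    superseded_by_later_study: bool  # overlapping work with more recent results exists
    year: int                        # year the study was written
    language: str
    has_professional_translation: bool = False


def meets_inclusion_criteria(study: Study) -> bool:
    """Return True only if every criterion in the table holds for the study."""
    return (
        {"annotation", "behaviors"} <= study.keywords          # (a) and (b)
        and (study.has_collaboration or study.has_automation)  # (c) or (d)
        and study.reports_methods
        and not study.superseded_by_later_study
        and study.year >= 2000
        and (study.language == "English" or study.has_professional_translation)
    )
```

A reviewer could then filter a candidate pool with `[s for s in candidates if meets_inclusion_criteria(s)]`. Treating overlap as a boolean flag is a simplification: the criterion in the table also asks that the original research be noted, which this sketch does not record.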

| Organization | Models |
| --- | --- |
| DeepMind | Gopher |
| Google | BERT, T5, LaMDA, MUM, PaLM |
| Huawei | PanGu-Alpha |
| Microsoft | Turing-NLG |
| Meta | BART, RoBERTa, XLM, OPT |
| Nvidia | Megatron-LM |
| OpenAI | GPT series, DALL·E, CLIP, Codex |
| Open Source | BLOOM, GPT-NeoX, GPT-J |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Watson, E.; Viana, T.; Zhang, S. Augmented Behavioral Annotation Tools, with Application to Multimodal Datasets and Models: A Systematic Review. AI 2023, 4, 128-171. https://doi.org/10.3390/ai4010007