Current Market Demand for Core Competencies of Librarianship: A Text Mining Study of American Library Association's Advertisements from 2009 through 2014

As librarianship evolves, it is important to examine the changes that have taken place in professional requirements. To provide an understanding of the current market demand for core competencies of librarianship, this article applies a semi-automatic methodology to analyze job advertisements (ads) posted on the American Library Association (ALA) Joblist from 2009 through 2014. There is evidence that the ability to solve unexpected complex problems and to provide superior customer service gained increasing importance for librarians during those years. The authors contend that the findings in this report question the status quo of core competencies of librarianship in the US job market.


Introduction
As a direct offshoot of the terms Business 2.0 and Web 2.0, Michael Casey coined the term "Library 2.0" on his blog LibraryCrunch. "Library 2.0" is a loosely defined model for a modernized form of library service, reflecting a transition within the library world in the way that services are delivered to users [1]. The focus is on user-centered change and participation in the creation of content and community [2]. According to the definition, the developments in Web tools and techniques are of great importance in making Library 2.0 possible. These developments have had a great impact on the skills and experiences required of librarians. As an extension of Library 2.0, the term "Librarian 2.0" is used to imply a change towards a revised understanding of the core competencies and qualities of librarianship [3].
As librarianship continues to evolve, it is important to examine the changes that have taken place in professional requirements. As others have claimed, analyzing librarian job advertisements (ads) can help accomplish this task [4]. Library school faculty may use these findings to develop curricula more closely aligned with real-life requirements of the workplace. Library administrators may gauge areas of growth and potential needs for organizational planning based on this analysis. It can also help to provide appropriate terminology for writing job ads to fill these needs. Finally, current library professionals can use these data to identify their continuing education needs, while library school students and those considering library school may base their course choices on the enhanced core competencies and qualities of librarianship.
ALA's Core Competences of Librarianship, approved by the ALA Executive Board in 2008 and later by the ALA Council in 2009, define basic knowledge that all librarians should possess. The document is authoritative and broad, indeed too broad. All the basic knowledge is important from the perspective of education. However, a number of the core competencies are seldom part of the requirements listed in job ads. For example, "The history of libraries and librarianship" would not necessarily appear in any advertisement as a requirement.
From 2009 through 2014, the librarian job market was driven by the pace of the digital age. A significant issue is to identify the core competencies for the current generation of American librarians. To provide an understanding of the current market demand for core competencies of librarianship, this article uses a semi-automatic text mining methodology to identify the trends and changes in the field, processing many more job ads than former studies. This article aims to answer the following three questions:
1. What kind of professional librarian positions are in demand in the current American job market?
2. What key skills and experiences are significant to a new generation of library professionals?
3. What requirements and qualifications are in alignment with ALA core competencies?
The authors contend that the findings reported in this article may provide a more comprehensive understanding of current core competencies of librarianship.

Core Competencies Identification among LIS
The study of core competencies in library and information science (LIS) is well documented in the literature. ALA approved and adopted a document which defines the knowledge to be possessed by all persons graduating from ALA-accredited master's programs in library and information studies [5]. This document lists nearly all possible competencies without priority, so it provides no indication of which competencies are most important to satisfy diverse job requirements. A focus group study of Librarian 2.0 characteristics was conducted among LIS professionals in Australia [6]. The researchers found that, besides qualifications, certain personal traits are essential in library work today and will be in the future. Important skills were identified in the following areas: communication, change management, collaboration, information management, leadership, marketing, project management and community engagement. Further, a Librarian 2.0 should be innovative, adaptable, flexible and active in keeping current. However, both the researchers and the focus group participants raised concerns about the label "Librarian 2.0". This label was believed to draw too much focus to Web 2.0 tools, relegating important issues of participation to the background.
Job ad analysis is a frequently used research method in this field. Choi and Rasmussen used job ads from 1999 to 2007 to identify which qualifications and skills are important for digital librarian positions in academic libraries [7]. They made a codebook based on ALA's Core Competences of Librarianship and found that qualifications concerning technology-related experience, institution management, resource building and knowledge organization are most frequently required. They also indicated an increasing demand for interpersonal and communication skills, and adaptive skills. Matthews and Pardue studied the contents of job ads from ALA's online JobLIST [8]. Their findings indicate that librarians need a wider range of IT skills. Web development is the most required skill, followed by program management and system development. However, programming languages are the least required.
Limited by the number of job ads that could be processed, former researchers usually focused on one specific type of position. Electronic resources librarian, as the position most related to the digital age, is a frequent focus in this field. Using job ads between 2005 and 2009, Sutton (2011) developed a list of core competencies for electronic resource librarians "to determine which competencies are taught in MLIS programs, which are taught in continuing education, and whether significant differences exist between the two education venues" [9]. After refining the list, the North American Serials Interest Group (NASIG) adopted it as NASIG's Core Competencies for Electronic Resources Librarians in July 2013 [10]. Building on Sutton's work, Hartnett expanded the time period of job ads considered and analyzed the changing trends in electronic resources librarian responsibilities and requirements [11].

Content Analysis Methodology in Current Job Ad Research
Each new study in this area has introduced methodology based on previous studies. To identify competencies from job ads, Sutton (2011) utilizes a content analysis methodology proposed by Krippendorff [12]. After selecting related job ads based on Heimer's study [13], Sutton adopts a random sample to create a codebook, ensuring that all texts in the corpus are coded in a consistent manner throughout the content analysis. Then Sutton transposes all selected data into the codebook to verify the reliability of the result. Sutton's coding result is reliable; however, the manual methodology can process only a limited amount of data, which is not sufficient for the present research to comprehensively identify the essential core competencies for Librarian 2.0.
Debortoli studied the competencies required for business intelligence and big data professionals by performing an automated content analysis of job ads using a text mining technique called latent semantic analysis (LSA) [14], which is a quantitative method for analyzing qualitative data. Assuming that the contexts in which a word appears largely determine the word's meaning, LSA extracts word usage patterns and their meaning through statistical computations [15]. This method can process a large amount of data, but it cannot utilize the results of former studies. Since job ads are full of professional terms and diverse expressions, the validity of this method cannot be guaranteed.

Data Selection in Current Librarian Job Ad Research
The end result of a study is necessarily determined by the original raw data. The data selection step plays an important role in content analysis. One assumption in job ad analysis research is that a job advertisement describes the important characteristics of a job. Studies show that job advertisements are created with the intention of describing the knowledge, skills, and abilities required of an incumbent so that the expectations of both candidates and hiring managers can be clearly linked to the job positions [16]. It is reasonable to assume that job advertisements act as proxies for a demand for human capital in industry and that they can provide insights into competency requirements. However, one must be aware that job ads do not always reflect an employer's true requirements, as the employer may ask for more competencies than can be reasonably expected from an applicant.
Another assumption is that advertised job positions may represent all job positions in a particular category. Sometimes this assumption is openly discussed and addressed as a limitation. Indeed, there is a discrepancy between the jobs held by working librarians at any given time and the jobs that are being advertised and are therefore open at a particular moment, either as new positions or as replacements. This has been specifically highlighted by some researchers who seek to identify what is newly desired in the jobs advertised [17].
The final assumption is that the job ads which a study analyzes may represent all job ads in a specific category. By comparing different data sources, Applegate claimed that commonly used print sources provide only a small fraction of available positions [18]. Applegate found that ALA Joblist and Academic Employment Network are the job ad aggregators with the widest coverage, but also suggested that obtaining job ads directly from institutions may provide more representative data.

Data Selection and Data Preprocessing
The data analyzed in this study are from ALA Joblist's database [19] from 2009 through 2014. According to Applegate's work, ALA Joblist covers 23% of available academic library positions, second only to Academic Employment Network [18]. Although directly gathering data from institutions may provide more representative data, it would be much too time-consuming.
The data source, however, introduces a deviation between the end result of the study and the actual job market. For example, Applegate's work shows that ALA Joblist covers a remarkably higher percentage of job ads from master's (20%) and doctoral (27%) institutions than from associate's (0%) and baccalaureate (10%) institutions [18]. Another potential limitation is that some job ads posted on ALA Joblist may point to a Web page where the position is more fully described. These Web pages are typically removed from the Web after the position is filled, making the information more difficult to obtain [20].
With a total of 9573 job ads posted on ALA Joblist from 1 January 2009 to 25 June 2014, the data include both full texts and basic information showing brief descriptions. Figure 1 shows the path and procedure of the data processing. After discarding job ads from Canada, 9198 job ads remain as the data records analyzed directly in this study using statistical computing techniques. Their full texts are preprocessed further for content analysis. In the preprocessing stage, the authors transform each full-text ad into several record units (see Figure 2). In his seminal text on content analysis, Krippendorff identifies three separate units of analysis that must be considered in content analysis: sampling units, context units, and recording units [12]. In this study, sampling units are the full texts of all 9198 job ads; context units are blocks describing requirements; recording units are segments in context units, each of which is supposed to describe a competency. An example of context units and recording units is shown in Figure 3.

Data Collection and Selection
To extract context units from a sampling unit, the authors divide it into sentences and build a rule-based classifier, which is designed to determine whether a sentence belongs to a context unit. For example, a context unit usually begins with "Required Qualifications:" on a single line and ends before a URL link for application (see Figure 3). The classifier achieves a hit rate of 90% and a recall rate of 84% on sentences from 10 randomly picked job ads.
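The extraction rule described above can be illustrated with a minimal sketch. The cue patterns, the function names, and the reading of "hit rate" as precision are assumptions made for illustration; the rule set actually used in the study is not reproduced in the article.

```python
import re

# Hypothetical cue patterns; the study's real rules are not given in the text.
START_CUE = re.compile(
    r"^\s*(required|preferred|minimum)?\s*qualifications\s*:?\s*$", re.IGNORECASE
)
URL = re.compile(r"https?://\S+", re.IGNORECASE)


def extract_context_unit(ad_text):
    """Return the non-empty lines between a qualifications header and the next URL line."""
    unit, inside = [], False
    for line in ad_text.splitlines():
        if not inside:
            if START_CUE.match(line):
                inside = True  # the header line itself is not part of the unit
            continue
        if URL.search(line):
            break  # an application link closes the block
        if line.strip():
            unit.append(line.strip())
    return unit


def precision_recall(predicted, actual):
    """Compare predicted and hand-labeled sets of sentence ids.

    "Hit rate" is interpreted here as precision (an assumption).
    """
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall
```

Evaluating such a classifier on a small hand-labeled sample, as the authors did with 10 ads, amounts to calling `precision_recall` with the sentence ids the classifier selected and the ids a human marked as belonging to a context unit.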
After that, the authors discard job ads without a context unit, in which no job requirements are mentioned, and divide the remaining sentences into recording units (a sentence may be further divided at semicolons). Via this process, the authors try to avoid mixing multiple competencies in one recording unit.
Finally, the authors transfer recording units from text into a "bag-of-words" (a sparse vector of occurrence counts of words). To unify the form of the word set, words are stemmed with the NLTK toolkit and irrelevant words (e.g., stop words) are removed.
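The preprocessing steps above can be sketched as follows. To keep the example self-contained, a crude suffix stripper stands in for the NLTK stemmer used in the study, and the stop list is a tiny illustrative one; both are assumptions, and the stems produced here will differ from Porter-style stems such as "scholarli" or "achiev".

```python
import re
from collections import Counter

# Tiny illustrative stop list; a real run would use a fuller list.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "with", "for", "or"}


def toy_stem(word):
    """Crude suffix stripper standing in for a real stemmer (not the Porter algorithm)."""
    for suffix in ("ing", "ly", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def recording_units(context_unit):
    """Split a context unit at semicolons, line breaks, and sentence boundaries."""
    parts = re.split(r"[;\n]|(?<=[.!?])\s+", context_unit)
    return [p.strip() for p in parts if p.strip()]


def bag_of_words(unit):
    """Lowercase, drop stop words, stem, and count the remaining tokens."""
    tokens = re.findall(r"[a-z]+", unit.lower())
    return Counter(toy_stem(t) for t in tokens if t not in STOP_WORDS)
```

Splitting at semicolons before counting words keeps each bag close to a single competency, which is the stated purpose of the recording unit.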

Codebook Construction
In this study, coding means transforming each job ad into a "bag-of-competencies" (a sparse vector of occurrences of competency tags). As job ads have been transferred into multiple bags of stemmed words (each recording unit corresponds to a bag of stemmed words) in data preprocessing, a codebook in this study consists of several rule-based classifiers. Each classifier judges whether a job ad requests a certain competency. A classifier contains a number of rules, each of which contains a word set and a corresponding competency tag (a competency tag is usually correspo). Different rules belonging to the same classifier contain different word sets and the same competency tag.
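The classifier structure described above can be sketched as a mapping from competency tags to rules. The first rule mirrors the "analyz; problem" rule from the paper's codebook; the second tag's rules and all function names are hypothetical and serve only as illustration.

```python
# Miniature codebook: tag -> list of rules; each rule is a set of stems that
# must all occur in the same recording unit. Only the first rule comes from
# the paper; the rest is hypothetical.
CODEBOOK = {
    "Analytical, problem solving skills": [{"analyz", "problem"}],
    "Communication skills": [{"communic"}, {"interperson", "skill"}],
}


def tags_for_unit(stems, codebook=CODEBOOK):
    """Return the tags whose classifier has at least one rule met by the unit."""
    stems = set(stems)
    return {
        tag
        for tag, rules in codebook.items()
        if any(rule <= stems for rule in rules)
    }


def bag_of_competencies(units, codebook=CODEBOOK):
    """A job ad requests a competency if any of its recording units carries the tag."""
    found = set()
    for stems in units:
        found |= tags_for_unit(stems, codebook)
    return found
```

Because rules are matched per recording unit rather than per ad, "analyz" in one requirement and "problem" in another do not trigger the tag, which is the reason for splitting ads into small units first.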
Figure 4 shows part of the final codebook in this study. For the rule "Analytical, problem solving skills: analyz; problem", if "analyz" and "problem" appear together in one recording unit (which has already been transferred into a bag of stemmed words), the recording unit meets the rule and the competency tag "Analytical, problem solving skills" is added to it. If any recording unit in a job ad carries the competency tag "Analytical, problem solving skills", it is concluded that the job ad requests that competency. All classifiers in the codebook are applied to a job ad in turn to transform it into a bag of competencies.
The authors then manually construct a competency tag, a detailed explanation, and a set of rules for each competency theme (see Table 1) according to its samples. Each rule in the codebook grows from one meaningful stemmed word (like "collect", which indicates competencies related to "collection") and one temporary competency tag. However, a rule with only one stemmed word may get a low hit rate. To extend the rule, the authors manually pick other stemmed words one by one, making the rule more accurate. When the rule becomes accurate enough, the competency tag for the rule is determined (either an existing tag or a new one). Of special note is that the rule may grow in multiple branches (see Figure 6).
The selection of "meaningful stemmed words" is based on the assumption that if a meaningful word or phrase can compose a rule, it will appear much more frequently in recording units belonging to the corresponding theme than in other recording units. After applying a codebook to all recording units, multiple competency tags may be added to each of them. To simplify the definition, a special tag called "None" is added to recording units without any competency tags so that all recording units own at least one tag.
Suppose $C = \{c_1, c_2, \ldots, c_n\}$ is the set of all competency tags in the codebook, $R$ is the set of all recording units, and $R_{c_i}$ is the set of recording units with tag $c_i$. For each stemmed word in the recording units, the authors calculate $n$ TF-IDF (term frequency-inverse document frequency) values, each of which corresponds to a competency tag. Suppose $w$ is a stemmed word and $RC_{c_i,w}$ is the number of recording units that contain $w$ and belong to $R_{c_i}$. The importance of $w$ is defined from the values $\{\mathit{tfidf}_{c_i,w}\}$ together with the indicator

$$E_{c_i,w} = \begin{cases} 0, & \text{if } w \text{ appears in any rule belonging to the classifier for } c_i, \\ 1, & \text{otherwise.} \end{cases}$$

Sorting all stemmed words by their importance, the authors pick one of them from the top to optimize the codebook. Once the codebook is changed, the importance of all stemmed words is updated. The iteration is repeated until no obvious new rules can be found.
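The ranking step can be illustrated with a short sketch. The exact TF-IDF formula and the way the per-tag scores are combined are not recoverable from the text, so the sketch assumes a term frequency of $RC_{c,w} / |R_c|$, an inverse document frequency of $\log(|R| / \mathrm{df}_w)$, and a maximum over tags after masking with $E_{c,w}$. All of these choices, and the function and parameter names, are assumptions.

```python
import math
from collections import defaultdict


def word_importance(units, unit_tags, rule_words):
    """Rank stems as candidate rule words.

    units:      list of sets of stems, one per recording unit
    unit_tags:  list of sets of tags, parallel to units ("None" if untagged)
    rule_words: dict tag -> set of stems already used in that tag's rules
    Returns dict stem -> importance under the ASSUMED tf-idf and max aggregation.
    """
    n_units = len(units)
    doc_freq = defaultdict(int)                    # units containing w
    rc = defaultdict(lambda: defaultdict(int))     # rc[tag][w] = RC_{c,w}
    tag_size = defaultdict(int)                    # |R_c|
    for stems, tags in zip(units, unit_tags):
        for w in stems:
            doc_freq[w] += 1
        for tag in tags:
            tag_size[tag] += 1
            for w in stems:
                rc[tag][w] += 1

    importance = {}
    for w, df in doc_freq.items():
        idf = math.log(n_units / df)
        best = 0.0
        for tag, size in tag_size.items():
            if w in rule_words.get(tag, set()):
                continue  # E = 0: the word is already used by this classifier
            best = max(best, rc[tag][w] / size * idf)
        importance[w] = best
    return importance
```

After each manual change to the codebook, the tags in `unit_tags` and the stems in `rule_words` change, so the ranking would be recomputed before the next candidate word is picked.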

The Roles of the Two Specified Codebooks
Simply put, the first codebook (see Table 2) contains basic job position information drawn from the records of ALA Joblist's job advertisements, such as job title, location, institution, minimum required degree, and years of work experience. These records supply the basic data elements used to conduct descriptive statistical analysis. Data elements in the sheet are well structured so that the authors can directly analyze them with Excel 2013. The codebook for competencies is more complex (Table 1, without key word groups). All job ads are coded with the final codebook constructed in Section 3.3 to extract the competencies that they request (see Table 3).

The Number of Job Ads is Consistently Increasing
The number of job advertisements posted on ALA Joblist increased substantially from 2009 to 2013, by about 18% per year on average. There were 1897 job ads posted in 2014 before June 25th, which implies a continuing increase in the number of job ads. The increase in positions advertised may serve as one indicator of the growth of the market. Figure 7 shows the distribution of ads per year.

Academic Libraries Offer Most Job Positions
Among the four major types of libraries, job advertisements from university and college libraries account for 62% overall, indicating that academic library jobs are still most in demand. For job seekers, especially library science students searching for professional positions, this information signifies that they might want to concentrate their competencies on academic library work, since these libraries provide the greatest number of positions. Public libraries account for 24% of the total positions, showing that they are still an important job market for professional librarians, while school libraries post the fewest job ads (about 1.58%). This probably does not imply that school media specialists are in less demand. One possible reason is that school libraries have modest professional requirements for their librarians and low advertising budgets, so they are disinclined to post job advertisements on ALA Joblist, while academic libraries are well funded and demand more highly qualified librarians. Figure 8 shows the percentage of job ads from the four major types of libraries.

Job Titles are Significantly Diverse
Job titles are not uniform among job ads for a specific professional librarian position. This phenomenon is particularly evident in recent job openings. While some institutions tend to use job titles that broadly cover work responsibilities, others do not reveal any of the responsibilities of the job, for example, "Librarian I". Some libraries like to expand a job title to describe responsibility, such as "Acquisitions and Electronic Resources Management Librarian" and "Acquisitions and Electronic Resources Librarian". There are also job titles with extra descriptors like "Assistant Librarian, Research and Instruction (Part-time)". Also, a job title may stand for one position with combined responsibility or several positions in one job ad, such as "Electronic Resources/Instruction Librarian". Figure 9 shows that, among 5546 job titles in a total of 9198 job ads, 84% appear only once, reflecting what we may call a "historical change" of library job duties to meet the current needs of information services in this digital era.
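The share of titles that appear only once can be computed with a simple count, as in the sketch below. Whether the authors normalized case or whitespace before counting is not stated; the normalization here and the function name are assumptions.

```python
from collections import Counter


def share_of_unique_titles(titles):
    """Fraction of distinct job titles that occur exactly once."""
    # Lowercasing and trimming are assumed normalization steps.
    counts = Counter(t.strip().lower() for t in titles)
    if not counts:
        return 0.0
    once = sum(1 for c in counts.values() if c == 1)
    return once / len(counts)
```

Note that the denominator is the number of distinct titles (5546 in the study), not the number of job ads (9198).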

Management Positions are Most in Demand
Despite the fact that job titles are not uniform, management positions, especially at the top level like "Library Director", are notably in more demand quantitatively than other types of positions advertised on the ALA Joblist (see Table 4). To obtain the exact proportion of managerial job positions, the authors manually sought out management positions from job titles appearing more than 5 times (2829 job positions in total). If other levels of managerial positions are included, such as "Director of Library Services", "Assistant Library Director" and "Branch Manager", the management positions account for about 40% of the 2829 job ads (see Table 5).
In addition, the authors transferred job titles into bags of words and linked synonyms to construct a simple clustering based on FP-growth. Four major categories with higher counts emerged: directors/heads, public services librarians, reference and technical services librarians, and electronic reference librarians (see Table 6).
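The idea behind this clustering can be sketched with a frequent word-set count over job titles. FP-growth itself is not reproduced here: the sketch enumerates small word sets by brute force, which finds the same frequent sets up to the chosen size but without the FP-tree that makes FP-growth efficient. The synonym map, thresholds, and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical synonym map used to link title variants before counting.
SYNONYMS = {"head": "director", "dean": "director", "services": "service"}


def title_words(title):
    """Turn a job title into a set of normalized words."""
    words = title.lower().replace("/", " ").replace(",", " ").split()
    return frozenset(SYNONYMS.get(w, w) for w in words)


def frequent_word_sets(titles, min_support=2, max_size=2):
    """Count every word set of size <= max_size and keep those meeting min_support."""
    counts = Counter()
    for title in titles:
        words = sorted(title_words(title))
        for size in range(1, max_size + 1):
            counts.update(combinations(words, size))
    return {ws: n for ws, n in counts.items() if n >= min_support}
```

Titles sharing a frequent word set, such as ("director", "library"), can then be grouped into one cluster.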

Two Major Requirements Cross Nearly All Librarian Positions
Besides years of work experience, another of the most common requirements specified in the ALA Joblist ads is the terminal degree. As shown in Figure 10, 87.46% of the advertised positions require a master's degree. Specifically, an ALA-accredited master's degree is often stated as a required or preferred qualification in the job description. However, this particular degree requirement may have some exceptions when considering the whole job market. For example, among 6 current job positions posted on the official homepage of MIT Library [21], only 2 require a master's degree, while 4 require a bachelor's degree as the minimum. Employers may not be willing to post positions with lower degree requirements on ALA Joblist.
While work experience is another major job requirement, management positions require longer work experience than positions providing specific services. As Figure 11 shows, apart from positions without specified experience requirements (39.61%), only a few job positions accept applicants with less than 2 years of work experience (10.77%). Of special note is the experience requirement in relation to job type. Director positions call for more experienced librarians. Among 530 job ads titled "Library Director", 418 (about 79%) demand 5 or more years of work experience. On the other hand, positions providing specific services (e.g., "Electronic Resources Librarian", "Reference Librarian", "Youth Services Librarian" and "Systems Librarian") are more likely to demand work experience of 2-5 years.

Interpersonal Skills are Top Needed Competencies
Although the job ads belong to different types of job positions, as shown in Table 3, there are top competencies that cut across different types of positions. Among those, communication skills, in the category "Foundations of the Profession", is the top competency and appears in almost half of the total positions. The next competency, project management in the category "Administration and Management", appears in 35% of the ads. Besides the top 2, there are 10 more competencies appearing in at least 10% of the ads. Among these common requirements, "communication skills", "work collaboratively", "leadership" and "customer service orientation" are all related to interpersonal skills, which are much emphasized in comparison to professional skills. One reason may be that professional work skills are usually required by related specific positions only, which causes a statistically low frequency on the whole. For example, among job ads containing "Electronic Resource" in the title, "electronic resources" has a frequency of 66.42% (among 134 job ads). Similarly, 79.49% of "metadata" job ads (156 overall) demand "cataloging" skills.
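Frequencies of this kind, overall or restricted to ads whose title contains a phrase, can be computed as in the sketch below. The record layout (a dict with "title", "year" and "tags") and the function name are hypothetical. As a consistency check on the reported figures, 66.42% of 134 ads corresponds to 89 ads and 79.49% of 156 ads corresponds to 124 ads.

```python
def tag_frequency(ads, tag, keep=lambda ad: True):
    """Share of the ads passing `keep` whose bag of competencies contains `tag`."""
    selected = [ad for ad in ads if keep(ad)]
    if not selected:
        return 0.0
    return sum(tag in ad["tags"] for ad in selected) / len(selected)
```

Passing a filter on the posting year instead of the title would give the per-year frequencies needed to follow a competency over time.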

Analytical, Problem Solving Skills Gain Importance; Reference Skills Become Less Needed
What follows is a look at the competencies that: (1) have a frequency of more than 5%; (2) can easily be tied to a descriptor or descriptors from the current analysis; (3) show a high level of movement over the time period (either positive or negative).
The first competency is "1I. The techniques used to analyze complex problems and create appropriate solutions." Corresponding to changes in librarianship, librarians are more likely to face new complex problems in their daily work. The skills to analyze and solve complex problems gained in importance from 2009 to 2014 (Figure 12). "Change, flexibility", which implies the ability to analyze and solve unexpected problems in a new environment, also showed a significant growth in frequency.
The second competency is "5A. The concepts, principles, and techniques of reference and user services..." "Reference" is a traditionally important skill for librarians and is the 4th most common requirement. However, as more and more reference work can be done with information systems, the demand for reference skills is shrinking: its frequency declined from 2009 to 2014 (Figure 13). On the other hand, the requirements related to user services increased slightly, which could be attributed to the increasing emphasis on user services in Library 2.0.

Discussion
What Kind of Professional Librarian Positions are in Demand in the Current American Job Market?
Foremost among the findings is the increasing number of positions posted on ALA Joblist. Although library science students and those considering library school should investigate other factors, such as the number of new graduates entering the market each year, to estimate their competitiveness, the remarkable increase in job ads from 2009 through 2014, which is around 18% per year on average, can be one indicator of the health of the market.
On the one hand, most job ads on ALA Joblist are posted by academic libraries; on the other hand, administrative and managerial positions are the most needed. Another useful piece of information for job seekers is the demand distribution across different kinds of job positions: management positions are most widely in demand, followed by public service librarian, technical service librarian and electronic resources librarian. Because of sample deviation and the difficulty in analyzing job titles, this result may be more indicative than accurate, but it is still a useful reference.

What Key Skills and Experiences are Significant to a New Generation of Library Professionals?
A professional degree and work experience are two commonly required qualifications. A master's degree is required by 87.46% of positions. Despite potential sample bias, a master's degree, especially an ALA-accredited master's degree, can be really helpful for job seekers. Most positions seek applicants with more than 2 years of work experience. That is also consistent with the educational and professional requirements and preferences of job ads from 1999 through 2007 [7].
Management positions may not be available to library science students, as they are likely to demand more than 5 years of work experience, while positions providing specific services have a lower requirement in this respect. Professional skills that are important for a given type of position seldom appear in job ads for other positions. Choosing a specific type of position and learning the related professional skills is therefore extremely important for library science students, as they usually do not have enough work experience to apply for a management position.
The central component in Library 2.0 is interactivity [1], which is consistent with the result that interpersonal skills are most widely required. Another important fact is that technology and tools change the working mode of library professionals. Reference skills show a decline in significance. Familiarity with new techniques like electronic resources and databases may benefit a new generation of library professionals. Flexibility and the ability to solve complex problems have been steadily gaining in importance in recent years. These findings support the end result of a focus group study of Librarian 2.0 characteristics in Australia [6].

What Requirements and Qualifications are in Alignment with ALA Core Competencies?
As a document covering all basic knowledge for librarians, ALA's core competencies cover most of the competencies identified in this study, except for qualifications. However, this study shows that new professional skills, such as programming skills and the ability to deal with electronic resources, may need to be specified.
Using the specially designed competency codebook (see Table 1), which corresponds to ALA's core competencies, together with text mining strategies, the 12 competencies that most frequently appear in the 7169 ALA Joblist postings can be grouped into five categories: (1) Interpersonal skills that include "Communication skills", "Change, flexibility" and "Analytical, Problem solving skills"; (2) Leadership that includes "work collaboratively"; (3) Reference and user services that include "customer services orientation"; (4) Cataloging and collection development; and (5) Technological knowledge and skills that include "general technological knowledge" and "electronic resources".

Conclusion and Further Study
All areas of librarianship have evolved and continue evolving over time. The pace of the digital age brought a significant revolution in librarianship, which is reflected in the change of core competencies of librarianship. This article proposes a semi-automatic method to do content analysis on a vast array of job ads and highlights how ALA's Core Competences of Librarianship may need further refinement. With periodic review and maintenance, the document may evolve with the profession and remain relevant, which is beneficial to those who seek its guidance.
More accurate and richer data surely produce more accurate and comprehensive results. Although the data for this study were retrieved from the ALA Joblist database, one of the largest recruitment websites for librarians in America, about one-third of the records have only simple and basic job descriptions, which made construction of a codebook difficult and prevented the implementation of further data mining methods. For example, the text mining method in this article transfers recording units into bags of words and loses syntactic information, and words with high TF-IDF may not be good identifiers for classifying recording units. The authors partly solved these problems with human supervision. The codebook is based on prior work and updated manually to guarantee its validity. On the other hand, because the method presents the researchers with a limited set of candidates when constructing a codebook, it becomes possible to process a large number of job ads.
As this study used a semi-automatic method to construct a codebook based on a former codebook, it would be easy to apply this method to samples from the same source. Researchers may track the changes in librarian competencies by repeating the survey over time. Another improvement for such a study is to incorporate job ads from other data sources, which can be a good complement. Also, start dates of job ads in this research range from Jan 2009 to Jun 2014. Job ads posted before 2009, in which year "Library 2.0" appeared, can be a good contrast in analyzing changing trends in competencies.
Last but not least, the authors cluster job positions with a simplified method, so only a rough conclusion is drawn. Further work may focus on the descriptions of responsibility in the full text. The correlation between responsibilities and requirements is also a good research topic.

Figure 1. Overview of data processing.

Data flow recovered from the diagram: full text of 9573 job ads (sampling units); discard job ads without context units; context units of 7169 job ads; divide each context unit into record units; record units in the form of text (example: "A record of scholarly achievement"); transfer record units from text into stemmed word sets and remove irrelevant terms; record units in the form of stemmed word sets (example: record, scholarli, achiev).

Figure 2. Data flow diagram of pre-processing.

Figure 3. Example for context unit and record unit.

Figure 4. Sketch for a part of the final codebook in this study.
The data flow diagram of the codebook construction is shown in Figure 5. ALA's Core Competences of Librarianship divides core competencies of librarianship into 8 categories: Foundations of the Profession, Information Resources, Organization of Recorded Knowledge and Information, Technological Knowledge and Skills, Reference and User Services, Research, Continuing Education and Lifelong Learning, and Administration and Management. The authors categorize the themes used in Sutton's codebook following this criterion, and categorize qualifications such as a second post-graduate degree as Qualification (see Table 1).

Figure 5. Data flow diagram of codebook construction.

Figure 8. Distribution of job ads for different types of libraries.

Figure 9. The distribution of job titles with different frequency.

Figure 10. Minimum degree requirement distribution in job ads on ALA Joblist.

Figure Reference and user services trends, 2009-2014.


Table 2. Codebook for basic information of job ads.

Table 4. Top 20 job titles with maximum repeat count.

Table 6. Top 5 job position clusters with maximum frequency.