A Study on Big Data Thinking of the Internet of Things-Based Smart-Connected Car in Conjunction with Controller Area Network Bus and 4G-Long Term Evolution


Introduction
According to Klynveld Peat Marwick Goerdeler (KPMG)'s Global Automotive Executive Survey 2016 [1], automotive trends are changing rapidly every year. For instance, connectivity and digitalization ranked tenth in 2015 but first in 2016. This shift is significant because vehicles may transform into mobile data rooms, which can lead to virtual product features and services [1]. One well-known technology following this trend is the Tesla Autopilot feature, which is composed of Autosteer, Autopark, driver assistance visualization, etc. [2]. Additionally, Tesla vehicles regularly receive over-the-air software updates.
All these emerging technologies are core to smart connected cars, and Figure 1 depicts the full range of connected car technologies and services [3]. Among the connected car packages provided by automakers, the vehicle management feature is selected for further discussion. In this paper, the vehicle management feature of the smart connected car means the collection and transmission of vehicle information via wireless communications. Before discussing the details, however, readers should understand the fundamental definition and concept of the smart connected car. The most significant criterion for a smart connected car is connectivity. This connectivity can be provided by a third-party system (a smartphone) or by the vehicle's own transmitter/receiver unit [4]. In this work, both connectivity ideas are employed in the system architecture. The first connectivity type refers to the Internet of Things (IoT). The main concept of IoT is that objects, sensors, and everyday items generate, exchange, and consume data through network connectivity and computing capability [5]. The second connectivity type results from Controller Area Network (CAN) bus communication, developed by Bosch (Gerlingen, Germany) [6]. Since most modern vehicles contain a large number of electronic sensors that communicate through the CAN bus, they match the definition of the second connectivity type. Thus, an IoT and CAN bus-based system architecture and the development of actual modules are proposed in this paper.
This paper is organized as follows: Section 2 describes related technologies and communications in terms of CAN bus, Vehicle-to-Vehicle (V2V), Vehicle-to-Infrastructure (V2I), Vehicle-to-Everything (V2X), and connected cars; Section 3 proposes research directions and a system architecture; Section 4 introduces the newly developed CAN bus and data transmission modules with testing results, the database (DB) design and implementation, and the proposed design of a mobile application and a cloud-based Distributed File System (DFS); Section 5 summarizes the research work and describes research limitations and our future work.

Controller Area Network Bus Network
As briefly described in the previous section, CAN bus was developed by Bosch in the early 1980s in response to the needs of many automakers [6]. The main objective of early CAN bus was to define a standard for network communication between sensors, actuators, controllers, and other nodes in real-time applications and to minimize redundant wiring [6,7]. Figure 2 shows the general idea of CAN bus. A well-known use of this technology in modern vehicles is in embedded systems, also known as Electronic Control Units (ECUs), which consist of more than seventy microprocessors communicating over networks to control the vehicle infotainment system, powertrain, etc. [8]. However, microprocessors cannot communicate directly by themselves, so a standard protocol has to be defined for transmitting and receiving data packets [6,8]. The protocol refers to the Open System Interconnection (OSI) seven-layer model consisting of a physical layer, data link layer, network layer, etc., and the CAN protocol standardizes the two lowest layers, i.e., the physical and data link layers [6,8]. In addition, the targeted applications of CAN bus communication in automobiles are the anti-lock brake system, driving assistance system, engine control, etc. [9]. Note that various communication protocols are used in the different domains of modern vehicles, such as the Local Interconnect Network (LIN) for the body, FlexRay for X-by-wire applications, Bluetooth, etc. [10]. This paper focuses on CAN bus because of its speed and cost-effectiveness compared to other protocols.
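To make the data link layer idea more concrete, the following minimal sketch models a classical CAN data frame as an identifier plus a payload of at most eight bytes, and illustrates the priority rule that the lowest identifier wins bus arbitration. The class name, helper function, and the two identifiers are assumptions chosen for illustration only; they do not come from the developed modules, and real identifier assignments are manufacturer-specific.

```python
from dataclasses import dataclass

MAX_STANDARD_ID = 0x7FF   # 11-bit identifier of the classical CAN base frame format
MAX_DATA_BYTES = 8        # a classical CAN frame carries at most 8 data bytes


@dataclass(frozen=True)
class CanFrame:
    """Simplified view of a classical CAN data frame (identifier and payload only)."""
    can_id: int
    data: bytes

    def __post_init__(self):
        if not 0 <= self.can_id <= MAX_STANDARD_ID:
            raise ValueError("identifier does not fit in 11 bits")
        if len(self.data) > MAX_DATA_BYTES:
            raise ValueError("classical CAN payload is limited to 8 bytes")

    @property
    def dlc(self) -> int:
        """Data length code: the number of payload bytes."""
        return len(self.data)


def arbitration_winner(frames):
    """Return the frame that would win arbitration: the one with the lowest identifier."""
    return min(frames, key=lambda frame: frame.can_id)


# Hypothetical identifiers and payloads, for illustration only.
engine = CanFrame(0x0C0, bytes([0x1A, 0xF8]))
door = CanFrame(0x3A0, bytes([0x01]))
winner = arbitration_winner([door, engine])
print(hex(winner.can_id), winner.dlc)
```

In this sketch the engine frame is selected because its identifier is numerically smaller, which is how time-critical messages can be given priority on a shared bus.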
Next, we need to examine which CAN is used for collecting vehicle information. As mentioned earlier, a number of ECUs are connected on the CAN bus, and two important concepts can be extracted. The first is the CAN bus topology, in which multiple ECUs communicate with each other through the CAN bus [11]. The second is that there are two different types of CAN, i.e., high-speed CAN and low-speed CAN, which are connected through the gateway shown in Figure 3 [10]. CAN types are categorized by protocol features and international standards [12]. Class A is a low-speed network used for sensors or executive management, and its speed rate is less than 10 KB per second [12]. As shown in Figure 3, seat control, door control, and lights control are the main applications of class A. Class B has a medium speed rate, and climate control is the main application of this class. The speed rate of class C is between 12 KB and 1 MB per second or faster, so high-speed CAN belongs to this class [12]. Engine control, brake control, speed control, and stability control relate to powertrain management and are the main applications of class C [12]. In addition, International Organization for Standardization (ISO) 11898, Society of Automotive Engineers (SAE) J2284, SAE J1939, etc., are the relevant standards for high-speed CAN bus communication [12]. Class D has a speed rate of more than 2 MB per second, and most infotainment systems are in this class [12]. The main goal of this research is to collect and transmit information such as the vehicle speed, Revolutions per Minute (RPM), brake status, etc., so class C is adopted in this work.

The Overview of Vehicle-to-Vehicle
V2V communication technology, known as the connected vehicle safety model and Vehicular Ad-hoc Networks (VANETs), was introduced by the U.S. Department of Transportation's (DOT) National Highway Traffic Safety Administration (NHTSA) in 2014 [13,14]. The primary idea of this technology is to improve the safety of light vehicles by enabling vehicles to share information or messages, e.g., the vehicle position, motion, size, etc., with each other through V2V protocols, e.g., Dedicated Short Range Communication (DSRC), on-board computers with sensors, and network interface cards such as Institute of Electrical and Electronics Engineers (IEEE) 802.11p, Bluetooth, etc. [13,14]. Note that safety improvement is associated with Forward Collision Warning (FCW), Blind Spot Warning (BSW), Lane Departure Warning (LDW), etc.
The in-vehicle components of the V2V system proposed by the U.S. DOT NHTSA are the DSRC radio, Global Positioning System (GPS), memory, safety application ECU, driver-vehicle interface, and the vehicle's internal communication networks [13]. In addition to in-vehicle components, the V2V system can be expanded to V2I applications based on Road Side Equipment (RSE), which may use other communications such as existing 3G and 4G-LTE cellular networks and Wi-Fi [13,14]. The overall concept of V2I is explained in the following section.

The Overview of Vehicle-to-Infrastructure
V2I technology is mainly focused on the Intelligent Transportation System (ITS). One well-known ITS example is the Active Traffic and Demand Management (ATDM) system, which was launched by the Washington State Department of Transportation (WSDOT) in August 2010 [15]. Note that V2I technology is also based on V2V communications. However, since IEEE 802.11p and DSRC in V2V communications are best suited to short-range connectivity, heterogeneous wireless technologies such as 3G, 4G Long Term Evolution (4G-LTE), IEEE 802.11, and IEEE 802.16e can be used for long-range connectivity to achieve effective V2I communications [14]. For this reason, wireless access points, known as Road-Side Units (RSUs), are the key components of V2I communications [14].

The Overview of Vehicle-to-Everything
The overall concept of V2X is communication between a vehicle and all other entities, including pedestrians, mobile devices, etc. Since V2X covers a wide range of aspects of the smart-connected car, the short-range connectivity limitation of IEEE 802.11p and DSRC motivates the exploration of new communication technologies. To the best of our knowledge, the fastest wireless network is LTE-Advanced (LTE-A), and 5G networks are an area of active, ongoing research [16,17]. Furthermore, 5G is expected to bring new capabilities to the smart-connected car due to extreme broadband, ultra-low latency, and edgeless connectivity [18]. The overall concept of V2V, V2I, and V2X is depicted in Figure 4.

The Overview of Connected Cars
Numerous studies have proposed ideas for connected cars related to our research. In [19], a Hadoop-based big data solution is proposed for processing vehicle data such as RPM, speed, GPS location, etc., which are obtained by a Bluetooth On-Board Diagnostics (OBD) scan tool. An Android mobile application was also developed, which enables users to see diagnostic information on the cloud. In [20], the authors propose a system implementation combining European On-Board Diagnostics (EOBD), GPS, General Packet Radio Service (GPRS), and the Universal Mobile Telecommunications System (UMTS) to assess the risk associated with vehicle usage as part of a Pay As You Drive (PAYD) insurance program used for the assessment of insurance premiums.

The Proposed Research Directions and System Architecture
According to the literature review in the previous section, the following research directions are determined.

1. Although powertrain-related vehicle information, which is used for driving information, is a primary source for big data analysis, both B-CAN and C-CAN are adopted.
2. A DB composed of a driver table, vehicle table, and diagnostics table needs to be implemented.
3. The diagnostics table stores each automaker's diagnostic codes so that drivers are able to check vehicle malfunctions via a mobile application.
4. The DB should be designed with normalization so that data redundancy can be avoided.
5. A cloud-based DFS implementation is required for big data analysis.
6. Since the IoT concept in this research refers to the V2I concept, which is also based on the V2V concept, short-range and long-range connectivity based on Bluetooth and 4G-LTE are employed as the main communication networks.

Symmetry 2017, 9, 152
Based on the defined research directions, the following system architecture is established, as shown in Figure 5. According to the proposed system architecture, there are three main components that need to be developed. The first component is the CAN bus module, through which vehicle information, mostly related to the powertrain, can be obtained. This CAN bus module also has hidden features that can remotely control a vehicle through a mobile application. Since these hidden features are outside the research scope, we do not mention them in the rest of the paper. Once information is obtained by the CAN bus module, it can be transmitted to a mobile device through Bluetooth and checked by software on a PC. The second main component is a data transmission module, which is developed based on 4G-LTE. Data transmission has two targets, a mobile device and a server, but its primary destination is the server. The third main component is a mobile application with two purposes: (1) to check vehicle malfunction codes by interacting with the server; (2) to transmit the obtained information to a cloud-based DFS via wireless communications. The last expected component is the implementation of the cloud-based DFS.
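As a rough illustration of the data flow described above, the sketch below packages one vehicle reading into a message that a data transmission module could send to the server and that the receiver could decode again. The record fields, the JSON wire format, and the function names are all assumptions made for this example; the paper does not specify the message format used between the modules and the server.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class VehicleSample:
    """One reading collected by the CAN bus module (field names are hypothetical)."""
    timestamp: float   # seconds since the epoch
    speed_kmh: int
    rpm: float
    brake_on: bool


def encode_sample(sample: VehicleSample) -> bytes:
    """Serialize a sample to UTF-8 JSON, one possible format for the 4G-LTE uplink."""
    return json.dumps(asdict(sample), sort_keys=True).encode("utf-8")


def decode_sample(payload: bytes) -> VehicleSample:
    """Inverse of encode_sample, as a server or mobile application might apply it."""
    return VehicleSample(**json.loads(payload.decode("utf-8")))


# Illustrative values only; they are not measurements from the paper.
sample = VehicleSample(timestamp=1500000000.0, speed_kmh=62, rpm=1726.0, brake_on=False)
payload = encode_sample(sample)
assert decode_sample(payload) == sample
```

A text format such as JSON is easy to inspect during testing; a compact binary encoding would be a reasonable alternative where bandwidth matters.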

Modules and Software Development
To achieve the first and sixth research directions described in the previous section, OBDII and Bluetooth were applied to develop the CAN bus module. It is important to note that OBDII is internationally standardized by ISO 15765-4, SAE J1939, ISO 9141-2, ISO 14230-4, SAE J1850 PWM, and SAE J1850 VPW. The circuit and PCB design included the function of the ELM327 chipset, a programmed microcontroller developed by ELM Electronics (London, ON, Canada). Note that the ELM327 protocol is a de facto PC-to-OBD interface standard. Figure 6 depicts a blueprint of the module. A Universal Asynchronous Receiver/Transmitter (UART) is used for communication between the CAN bus module and the data transmission module. Communication between the data transmission module and the server is performed through 4G-LTE. Additionally, vehicle information, e.g., RPM, speed, gear, brake status, etc., can be visually checked with the developed software shown in Figure 7a. The CAN bus module and PC software were developed in C with IAR Embedded Workbench for ARM (IAREWARM) (IAR Systems, Uppsala, Sweden) and in Visual C++ with MS Visual Studio 2010 (Microsoft Corporation, Redmond, WA, USA). Actual testing was conducted, and we confirmed that all the information was successfully received through the developed modules. Figures 7 and 8 show our actual deliverables.
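For readers unfamiliar with OBDII data, the following sketch shows how the text responses typically returned by an ELM327-style interface for two standard Mode 01 parameters could be decoded: engine RPM (PID 0x0C, value (256·A + B)/4) and vehicle speed (PID 0x0D, value A in km/h). The function names and the sample response strings are assumptions for illustration; this is not the firmware or PC software developed in this work, which is written in C and Visual C++.

```python
from typing import Tuple


def parse_mode01_response(line: str) -> Tuple[int, bytes]:
    """Split a Mode 01 response such as '41 0C 1A F8' into (PID, data bytes)."""
    raw = bytes.fromhex(line.replace(" ", ""))
    # 0x41 is the positive response to a Mode 01 request (0x01 + 0x40).
    if len(raw) < 3 or raw[0] != 0x41:
        raise ValueError(f"not a Mode 01 response: {line!r}")
    return raw[1], raw[2:]


def decode_rpm(data: bytes) -> float:
    """PID 0x0C: engine speed in revolutions per minute, (256*A + B) / 4."""
    return (256 * data[0] + data[1]) / 4


def decode_speed_kmh(data: bytes) -> int:
    """PID 0x0D: vehicle speed in km/h, given directly by byte A."""
    return data[0]


# Sample response strings, made up for this example.
pid, data = parse_mode01_response("41 0C 1A F8")
print(hex(pid), decode_rpm(data))        # 0xc 1726.0

pid, data = parse_mode01_response("41 0D 3E")
print(hex(pid), decode_speed_kmh(data))  # 0xd 62
```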

Database Implementation
The DB was implemented with Microsoft Structured Query Language (MS-SQL) Server. As mentioned in the second research direction, our initial relational DB was designed with three tables, i.e., the driver, vehicle, and diagnostics tables, as shown in Figure 10.
The most significant problem of the initial relational DB is that it is likely to cause data redundancy. For this reason, we revised the relational DB according to the normalization concept by applying the first, second, and third normal forms (1NF, 2NF, and 3NF). By doing so, we achieved the fourth research direction, and the normalized DB is shown in Figure 11. Note that a solid underline and a dotted underline denote a primary key and foreign keys, respectively.
In the normalized DB, the vehicle powertrain-related information belongs to the driving information table (Driving_Info table). Drivers or users need to register their personal information, which is stored in the driver table. Five categories of data (i.e., manufacturer name, vehicle year, model, model details, and engine size) need to be predefined by a DB administrator so that users can easily select their vehicle information from a dropdown menu. The DB administrator is also required to define diagnostic codes based on generic codes and manufacturer codes, which achieves the third research direction. Diagnostic codes are classified into two types. For instance, if a diagnostic code starts with P0 (P0XXX), it is a powertrain-related generic code. If a diagnostic code starts with P1 (P1XXX), it is a powertrain-related manufacturer code. Note that four letters indicate which part has a malfunction, i.e., B for body, C for chassis, P for powertrain, and U for user network. Since there is a huge amount of diagnostic data related to generic and manufacturer codes, we initially imported a total of 773 powertrain-related generic codes and 1433 manufacturer codes. The targeted automakers for manufacturer codes were Audi (Ingolstadt, Germany), Mercedes-Benz (Stuttgart, Germany), BMW (Munich, Germany), Hyundai (Seoul, Korea), Kia (Seoul, Korea), Chevrolet (Detroit, MI, USA), and Honda (Tokyo, Japan) because those automakers hold the largest market shares in Korea and the United States. We are still importing diagnostic codes for other manufacturers as well as the remaining generic codes.
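The code classification and the normalization idea described above can be sketched as follows. The example uses SQLite in place of MS-SQL so that it is self-contained, and the table names, column names, the manufacturer "ExampleMotors", and the code P1234 with its description are hypothetical; they do not reproduce the schema of Figure 11. It assumes the standard five-character code layout (one letter followed by four characters). The point of the sketch is only that each manufacturer name is stored once and referenced by key, and that a code's first two characters determine its system and whether it is generic or manufacturer-specific.

```python
import re
import sqlite3

SYSTEMS = {"B": "body", "C": "chassis", "P": "powertrain", "U": "network"}
SCOPES = {"0": "generic", "1": "manufacturer"}
DTC_PATTERN = re.compile(r"^[BCPU][0-3][0-9A-F]{3}$")


def classify_dtc(code):
    """Return (system, scope) for a five-character diagnostic code such as 'P0300'."""
    code = code.strip().upper()
    if not DTC_PATTERN.match(code):
        raise ValueError(f"not a five-character diagnostic code: {code!r}")
    # Second characters other than 0 and 1 are outside the two types discussed here.
    return SYSTEMS[code[0]], SCOPES.get(code[1], "other")


# Hypothetical normalized tables: manufacturer names are stored exactly once.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE manufacturer (
    manufacturer_id INTEGER PRIMARY KEY,
    name            TEXT NOT NULL UNIQUE
);
CREATE TABLE diagnostic_code (
    code            TEXT NOT NULL,
    manufacturer_id INTEGER REFERENCES manufacturer(manufacturer_id),  -- NULL for generic codes
    description     TEXT NOT NULL,
    UNIQUE (code, manufacturer_id)
);
""")
maker_id = conn.execute(
    "INSERT INTO manufacturer (name) VALUES (?)", ("ExampleMotors",)
).lastrowid
conn.executemany(
    "INSERT INTO diagnostic_code (code, manufacturer_id, description) VALUES (?, ?, ?)",
    [
        ("P0300", None, "Random/Multiple Cylinder Misfire Detected"),
        ("P1234", maker_id, "Hypothetical manufacturer-specific fault"),
    ],
)


def lookup(conn, code, manufacturer=None):
    """Find a description; manufacturer codes require the manufacturer name."""
    _, scope = classify_dtc(code)
    if scope == "generic":
        row = conn.execute(
            "SELECT description FROM diagnostic_code "
            "WHERE code = ? AND manufacturer_id IS NULL", (code,)
        ).fetchone()
    else:
        row = conn.execute(
            "SELECT d.description FROM diagnostic_code d "
            "JOIN manufacturer m ON m.manufacturer_id = d.manufacturer_id "
            "WHERE d.code = ? AND m.name = ?", (code, manufacturer)
        ).fetchone()
    return row[0] if row else None


print(classify_dtc("P0300"), lookup(conn, "P0300"))
print(classify_dtc("P1234"), lookup(conn, "P1234", "ExampleMotors"))
```

Because the same manufacturer code can mean different things for different automakers, the lookup for manufacturer codes is keyed on both the code and the manufacturer, whereas generic codes need only the code itself.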

Mobile Application
From this section onward, we mainly focus on proposing the system design because we have not yet developed the actual deliverables. The intended mobile application is composed of seven functionalities, i.e., dashboard, diagnostics, graphing, restore, diagnostics logs, preferences, and diagnostics search. To design the mobile application, we referred to a variety of Android applications on the Google Play (Google, Mountain View, CA, USA) market, so the names of the seven functionalities may change. The purpose of the mobile application is to check vehicle diagnostic codes and driving information in real time. The planned development platform is OpenXC because it supports open-source hardware and software.
The dashboard functionality is intended to check the vehicle speed, RPM, accelerating status, braking status, etc., in real time. The diagnostics functionality enables users to check a vehicle malfunction in real time by interacting with the server, and diagnostic results are automatically stored in the diagnostics logs. In addition, the diagnostics search functionality helps users find diagnostic codes from the server. The graphing functionality is used to understand driving habits, so the real-time gas mileage, average gas mileage, fuel consumption, etc., are provided to users. The restore functionality is employed for data backup, and the preferences functionality is related to the application settings. A mock-up of the application interfaces, created with the Balsamiq Mockups (Balsamiq Studios, Sacramento, CA, USA) tool, is depicted in Figure 12.
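As an illustration of the quantities the graphing functionality is meant to display, the following Python sketch derives real-time and average gas mileage from sampled data. It is only a sketch under stated assumptions: the application has not been developed, and the `Sample` fields (vehicle speed in km/h, fuel flow in litres per hour, and sample duration in seconds) are hypothetical inputs rather than signals confirmed to be available from the module.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence


@dataclass(frozen=True)
class Sample:
    speed_kmh: float      # vehicle speed (assumed input)
    fuel_rate_lph: float  # fuel flow in litres per hour (assumed input)
    dt_s: float           # seconds covered by this sample


def instant_km_per_l(sample: Sample) -> Optional[float]:
    """Real-time mileage in km/l; None when no fuel is flowing."""
    if sample.fuel_rate_lph <= 0:
        return None
    return sample.speed_kmh / sample.fuel_rate_lph


def trip_summary(samples: Sequence[Sample]) -> Dict[str, Optional[float]]:
    """Accumulate distance and fuel over a trip and derive the average mileage."""
    distance_km = sum(s.speed_kmh * s.dt_s / 3600.0 for s in samples)
    fuel_l = sum(s.fuel_rate_lph * s.dt_s / 3600.0 for s in samples)
    avg = distance_km / fuel_l if fuel_l > 0 else None
    return {"distance_km": distance_km, "fuel_l": fuel_l, "avg_km_per_l": avg}
```

Note that the average is computed from total distance and total fuel rather than by averaging the instantaneous values, so that idling periods (zero distance, non-zero fuel) are weighted correctly.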

Proposing A Cloud-Based Distributed File System for Big Data Analysis
We have not yet started implementing the cloud-based DFS for big data analysis, but our initial considerations are as follows:
• The main purpose is to share diagnostic and driving information.
• A variety of DFSs need to be examined.
• Protocols to share and store diagnostic and driving information are web service-based.
• The collected information goes through preprocessing, and a software framework needs to be employed for data processing, analysis, and reporting.
• A big data analysis tool using R, Python, Matlab, or other tools needs to be deployed [21][22][23].
• The analyzed result is provided to users through data visualization.
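The preprocessing and reporting steps listed above can be sketched in Python as follows. This is a hedged illustration only, since the cloud-based DFS has not been implemented: the JSON record layout, the field names in `REQUIRED` and `NUMERIC`, and the plausibility limits are all assumptions made for the example.

```python
import json
from statistics import mean
from typing import Any, Dict, List

# Hypothetical record layout for data received through a web service.
REQUIRED = ("vehicle_id", "timestamp", "speed_kmh", "rpm")
NUMERIC = ("speed_kmh", "rpm")


def parse_payload(body: str) -> List[Dict[str, Any]]:
    """Decode a JSON array of driving records."""
    records = json.loads(body)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of records")
    return records


def preprocess(records: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Drop records with missing fields or implausible values (assumed limits)."""
    clean = []
    for r in records:
        if not isinstance(r, dict) or any(r.get(k) is None for k in REQUIRED):
            continue
        if not all(isinstance(r[k], (int, float)) for k in NUMERIC):
            continue
        if not (0 <= r["speed_kmh"] <= 300 and 0 <= r["rpm"] <= 10000):
            continue
        clean.append(r)
    return clean


def report(records: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Summarize cleaned records for later visualization."""
    if not records:
        return {"count": 0}
    return {
        "count": len(records),
        "mean_speed_kmh": mean(r["speed_kmh"] for r in records),
        "max_rpm": max(r["rpm"] for r in records),
    }
```

In a deployed system these steps would more likely run inside the chosen software framework on the cloud side; the sketch only shows the order of operations (receive, clean, summarize).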
After laying out these considerations, we identified one issue with the implemented SQL-based DB. Our initial plan was for the mobile application to interact with the server in response to user requests for diagnostic and driving information. Figure 13 depicts the flow of the diagnostics and graphing functionalities as a sequence diagram. The most significant issue is that the design and implementation of our DB conflict with the requirements of a cloud-based DFS, such as scalability, integrity, and velocity. We therefore began examining various DFSs for big data analysis.

Google File System
Google uses a self-developed file system, the Google File System (GFS), to provide its cloud services [24]. GFS is Linux-based and is composed of a single master and multiple chunk servers. The single master manages the metadata of the file system and controls system operations. Chunk servers handle the input and output requests of clients and store the actual data.
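The division of labor between the single master and the chunk servers can be illustrated with a toy Python model. This is a conceptual sketch, not GFS code: the class names, the `read_byte` helper, and the in-memory dictionaries are hypothetical, and the 64 MB default reflects the chunk size reported for GFS rather than anything measured in this work.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ChunkServer:
    """Stores the actual chunk data, keyed by chunk handle."""
    name: str
    chunks: Dict[int, bytes] = field(default_factory=dict)


@dataclass
class Master:
    """Holds metadata only: which chunks make up a file and where they live."""
    chunk_size: int = 64 * 1024 * 1024  # assumed default, 64 MB
    # file path -> ordered list of (chunk handle, [chunk server names])
    files: Dict[str, List[Tuple[int, List[str]]]] = field(default_factory=dict)

    def locate(self, path: str, offset: int) -> Tuple[int, List[str]]:
        """Map a byte offset in a file to its chunk handle and replicas."""
        return self.files[path][offset // self.chunk_size]


def read_byte(master: Master, servers: Dict[str, ChunkServer],
              path: str, offset: int) -> int:
    """Client read: ask the master for metadata, then fetch from a chunk server."""
    handle, replicas = master.locate(path, offset)
    data = servers[replicas[0]].chunks[handle]
    return data[offset % master.chunk_size]
```

The point of the sketch is that file contents never pass through the master; the client contacts it only for the lookup and then reads directly from a chunk server.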

Hadoop Distributed File System
Hadoop DFS (HDFS) is an open-source file system that was developed based on GFS [25]. The HDFS architecture is very similar to that of GFS: a single namenode plays a role similar to the single master, and datanodes correspond to chunk servers.

Amazon Web Service
Amazon Web Services (AWS) offers cloud storage services including the Elastic File System (EFS), Elastic Block Store (EBS), and Simple Storage Service (S3) [26]. EFS provides file storage for Amazon Elastic Compute Cloud (Amazon EC2) instances in the cloud environment. EBS provides two types of block storage volumes in the AWS cloud, i.e., solid-state drive (SSD) volumes and hard disk drive (HDD) volumes. S3 is an object storage service in which data can be stored and retrieved through a web service interface.

Proposing Distributed File System Design
According to our examination of DFSs for big data analysis, Microsoft Azure would be the strongest candidate for our system because of its scalability and compatibility, and we propose the DFS design shown in Figure 14. Once vehicle information is transmitted to and stored in the cloud server via LTE, the acquired data is loaded from HDFS into Azure HDInsight via Sqoop for data analysis. Note that Sqoop is a tool used to import and export data between a DB and HDInsight. The data analysis has two objectives. The first is to provide data reporting and visualization of driving habits, diagnostic logs, etc. The second, which is optional, is to build prediction models from training data. For example, once enough data has been collected to train a model, we would apply a machine learning technique to predict upcoming maintenance events, recommend the most fuel-efficient route, etc.

Conclusions
Governments and automakers around the world are actively seeking innovative technologies for V2V, V2I, and V2X connectivity, and the fundamental technology to fulfill such connectivity is IoT. This new technical paradigm motivated us to explore new technologies regarding the smart-connected car. In this paper, we successfully developed actual modules based on CAN bus and 4G-LTE to obtain and transmit vehicle powertrain information. The information obtained by the modules was also visually checked with our developed software. However, although we implemented an SQL-based relational DB

Figure 1. The range of connected car technologies and services.

Figure 2. General idea of Controller Area Network (CAN) bus. GPS: Global Positioning System.

Figure 3. High-speed and low-speed CAN buses.

Figure 6. Blueprint of the module.

Figure 7. (a) Software to check vehicle information; (b) actual developed CAN bus module.

Figure 8. (a) CAN bus module for testing; (b) actual data transmission module.

Figure 9. (a) Radiated disturbance result based on the peak detector mode only; (b) radiated disturbance result based on the peak and average detector modes.

Figure 11. The normalized DB. DOB: Date of Birth.
The DB was implemented based on Microsoft SQL Server (MS-SQL), but as mentioned in the second research direction, our initial relational DB was designed with three DB tables, i.e., the driver, vehicle, and diagnostics tables, shown in Figure 10 below.

Figure 13. Sequence diagram of the diagnostics and graphing functionality.

Microsoft Azure
Microsoft provides the Azure cloud platform service based on Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) [27]. The Windows Azure platform consists of Windows Azure, SQL Azure, and Windows Azure platform AppFabric. The most notable difference between Microsoft Azure and Amazon EC2 is the cloud stack: while Microsoft Azure provides both IaaS and PaaS, Amazon EC2 provides only IaaS. Another notable feature of Microsoft Azure is its support for NoSQL. In contrast with a traditional Relational Database Management System (RDBMS), NoSQL does not need to join database tables, which simplifies data access.
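The difference in data access between an RDBMS and a NoSQL store can be illustrated with a small Python sketch. It uses the standard-library `sqlite3` module as a stand-in for a relational DB and a plain dictionary as a stand-in for a document store; the table names, fields, and sample values are hypothetical and do not reflect the actual schema or any Azure API.

```python
import sqlite3

# Relational layout: vehicle data and diagnostic codes live in separate tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vehicle (vehicle_id INTEGER PRIMARY KEY, model TEXT);
CREATE TABLE diagnostic (
    id INTEGER PRIMARY KEY,
    vehicle_id INTEGER REFERENCES vehicle(vehicle_id),
    code TEXT
);
INSERT INTO vehicle VALUES (1, 'ExampleModel');
INSERT INTO diagnostic VALUES (1, 1, 'P0301');
INSERT INTO diagnostic VALUES (2, 1, 'P1234');
""")


def codes_relational(vehicle_id: int):
    """Answer the query with a join across the two tables."""
    rows = conn.execute(
        "SELECT v.model, d.code FROM vehicle v "
        "JOIN diagnostic d ON d.vehicle_id = v.vehicle_id "
        "WHERE v.vehicle_id = ? ORDER BY d.id",
        (vehicle_id,),
    ).fetchall()
    return [(model, code) for model, code in rows]


# Document layout: everything about one vehicle is stored together.
documents = {1: {"model": "ExampleModel", "diagnostics": ["P0301", "P1234"]}}


def codes_document(vehicle_id: int):
    """Answer the same query with a single key lookup and no join."""
    doc = documents[vehicle_id]
    return [(doc["model"], code) for code in doc["diagnostics"]]
```

Both functions return the same pairs of model and diagnostic code; the document form trades normalization for simpler reads, which is the property the text attributes to NoSQL.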

Symmetry 2017, 9, 152

Table 1. Testing criteria for broadband and narrowband radiated disturbances.