Chapter 3 Data transformation

We collected several datasets from Wikipedia. We divide them into four sections and transform each one separately. Later, we use the cleaned data to draw plots and answer the questions posed in the introduction.

3.1 iPhone Features Data

The Apple product features dataset is large and covers many aspects of each product. We focus on a few iPhone features (release date, display, and rear camera) across all available models. First, we merge the tables for the different models, and then we split the result into smaller tables, one per feature.
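
The merge-then-split step can be sketched roughly as follows. This is a minimal illustration in base R, not the code used for this chapter: the tables `specs_a` and `specs_b`, the column name `Feature`, the helper `feature_table()`, and the cell values are all hypothetical, and the sketch assumes each source table has one row per feature and one column per model.

```r
# Hypothetical spec tables: one row per feature, one column per model.
specs_a <- data.frame(
  Feature = c("Released", "Display"),
  `iPhone 7` = c("September 16, 2016", "4.7 in"),
  check.names = FALSE, stringsAsFactors = FALSE
)
specs_b <- data.frame(
  Feature = c("Released", "Display"),
  `iPhone 8` = c("September 22, 2017", "4.7 in"),
  check.names = FALSE, stringsAsFactors = FALSE
)

# Merge on the shared feature column so that every model becomes a column.
specs <- merge(specs_a, specs_b, by = "Feature", all = TRUE)

# Hypothetical helper: pull one feature row out of the merged table and
# reshape it into a small table with one row per model.
feature_table <- function(specs, feature) {
  row <- specs[specs$Feature == feature, -1, drop = FALSE]
  out <- data.frame(
    Model = names(row),
    Value = unlist(row, use.names = FALSE),
    stringsAsFactors = FALSE
  )
  names(out)[2] <- feature
  out
}

release <- feature_table(specs, "Released")
```

Keeping one small table per feature means each cleaning step only has to handle one kind of value (dates, sizes, or resolutions).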

For the release date table, we added a column that stores the date in a proper Date format. For the display table, we focus on screen size and resolution, and we added columns for the screen size in inches (ScreenSizeIn) and for the two resolution dimensions (ResX and ResY).
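
The date conversion described above might look like the sketch below. It assumes the raw `Released` strings follow the pattern "June 29, 2007"; that format, the example rows, and the model labels are assumptions made for illustration, not taken from the actual table.

```r
# Illustrative rows only; the raw string format is an assumption.
release <- data.frame(
  Model = c("iPhone (1st generation)", "iPhone 13"),
  Released = c("June 29, 2007", "September 24, 2021"),
  stringsAsFactors = FALSE
)

# %B matches the full month name in the current locale, so this
# assumes an English (or C) locale for LC_TIME.
release$ReleaseDate <- as.Date(release$Released, format = "%B %d, %Y")

# Keep the year as a character column, e.g. for grouping in plots.
release$ReleaseYear <- format(release$ReleaseDate, "%Y")
```

Storing the parsed value in a separate column keeps the original text available for checking rows that fail to parse (they come back as NA).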

iPhone Release Date

##     Model             Released         Discontinued        ReleaseDate         ReleaseYear          DisconDate        
##  Length:33          Length:33          Length:33          Min.   :2007-06-29   Length:33          Min.   :2008-07-11  
##  Class :character   Class :character   Class :character   1st Qu.:2014-09-19   Class :character   1st Qu.:2016-09-07  
##  Mode  :character   Mode  :character   Mode  :character   Median :2017-09-22   Mode  :character   Median :2019-09-10  
##                                                           Mean   :2016-11-25                      Mean   :2018-08-25  
##                                                           3rd Qu.:2020-04-24                      3rd Qu.:2021-09-14  
##                                                           Max.   :2021-09-24                      Max.   :2021-12-12

iPhone Display table

##     Model           PixelDensity(ppi)  AspectRatio        TypicalMaxbrightness( cd⁄m2) Contrastratio(typical)
##  Length:23          Length:23          Length:23          Length:23                    Length:23             
##  Class :character   Class :character   Class :character   Class :character             Class :character      
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character             Mode  :character      
##                                                                                                              
##                                                                                                              
##                                                                                                              
##  TrueToneDisplay    ProMotionDisplay   HDR10Content       DolbyVision           Taptic          TypicalMaxbrightness
##  Length:23          Length:23          Length:23          Length:23          Length:23          Length:23           
##  Class :character   Class :character   Class :character   Class :character   Class :character   Class :character    
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character   Mode  :character   Mode  :character    
##                                                                                                                     
##                                                                                                                     
##                                                                                                                     
##   ScreenSizeIn        ResX           ResY     
##  Min.   :4.000   Min.   :1136   Min.   : 640  
##  1st Qu.:5.420   1st Qu.:1792   1st Qu.: 828  
##  Median :5.850   Median :2340   Median :1080  
##  Mean   :5.667   Mean   :2125   Mean   :1035  
##  3rd Qu.:6.060   3rd Qu.:2532   3rd Qu.:1170  
##  Max.   :6.680   Max.   :2778   Max.   :1284
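
The ScreenSizeIn, ResX, and ResY columns summarized above could be derived from text columns along these lines. The raw column names `ScreenSize` and `Resolution` and the string formats shown are assumptions for illustration; the real Wikipedia cells may be formatted differently.

```r
# Illustrative rows only; raw column names and formats are assumptions.
display <- data.frame(
  Model = c("iPhone SE (1st generation)", "iPhone 13 Pro Max"),
  ScreenSize = c("4 in (100 mm)", "6.68 in (170 mm)"),
  Resolution = c("1136 x 640", "2778 x 1284"),
  stringsAsFactors = FALSE
)

# Screen size: keep the leading number that precedes "in".
display$ScreenSizeIn <- as.numeric(
  sub("^[[:space:]]*([0-9.]+)[[:space:]]*in.*$", "\\1", display$ScreenSize)
)

# Resolution: split on any run of non-digit characters, so both
# "x" and a multiplication sign work as separators.
res_parts <- strsplit(display$Resolution, "[^0-9]+")
display$ResX <- as.numeric(vapply(res_parts, `[`, character(1), 1))
display$ResY <- as.numeric(vapply(res_parts, `[`, character(1), 2))
```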

iPhone Rear Camera table

##     Model            iPhone 6S         iPhone 6S Plus     iPhone SE(1st generation)   iPhone 7         iPhone 7 Plus     
##  Length:35          Length:35          Length:35          Length:35                 Length:35          Length:35         
##  Class :character   Class :character   Class :character   Class :character          Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character          Mode  :character   Mode  :character  
##    iPhone 8         iPhone 8 Plus        iPhone X          iPhone XS         iPhone XS Max       iPhone XR        
##  Length:35          Length:35          Length:35          Length:35          Length:35          Length:35         
##  Class :character   Class :character   Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##  iPhone 11 Pro      iPhone 11 Pro Max  iPhone 12 Pro      iPhone 12 Pro Max   iPhone 11         iPhone SE(2nd generation)
##  Length:35          Length:35          Length:35          Length:35          Length:35          Length:35                
##  Class :character   Class :character   Class :character   Class :character   Class :character   Class :character         
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character   Mode  :character   Mode  :character         
##  iPhone 12 Mini      iPhone 12         iPhone 13 Mini      iPhone 13         iPhone 13 Pro      iPhone 13 Pro Max 
##  Length:35          Length:35          Length:35          Length:35          Length:35          Length:35         
##  Class :character   Class :character   Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character   Mode  :character   Mode  :character

iPhone supported months table

##     Model           Months supported to date
##  Length:33          Length:33               
##  Class :character   Class :character        
##  Mode  :character   Mode  :character

3.2 Apple Finance Data

The finance table contains revenue, net income, total assets, and number of employees. Some values are missing, so we replace the string “N/A” with NA in our data frame. We convert the year column to the Date format and change the remaining columns from character to numeric. We also remove the spaces from the column names.

##       Year         Revenue         NetIncome      TotalAssets       Employees     
##  Min.   :2000   Min.   :  5363   Min.   :  -25   Min.   :  6021   Min.   : 14800  
##  1st Qu.:2005   1st Qu.: 13931   1st Qu.: 1328   1st Qu.: 11516   1st Qu.: 33725  
##  Median :2010   Median : 65225   Median :14013   Median : 75183   Median : 76550  
##  Mean   :2010   Mean   :111160   Mean   :23818   Mean   :142555   Mean   : 77388  
##  3rd Qu.:2015   3rd Qu.:215639   3rd Qu.:45687   3rd Qu.:290345   3rd Qu.:117750  
##  Max.   :2020   Max.   :274515   Max.   :59531   Max.   :375319   Max.   :147000  
##                                                                   NA's   :5
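
A minimal sketch of the cleaning steps for the finance table is shown below. The data frame `raw`, its values, and the assumption that numbers contain thousands separators are all made up for illustration; the year conversion is left out of the sketch.

```r
# Toy input with made-up values; only the cleaning steps matter here.
raw <- data.frame(
  Year = c("2000", "2001"),
  Revenue = c("1,000", "2,000"),
  `Net Income` = c("100", "-25"),
  Employees = c("N/A", "14,800"),
  check.names = FALSE, stringsAsFactors = FALSE
)

finance <- raw

# Remove spaces from the column names ("Net Income" -> "NetIncome").
names(finance) <- gsub(" ", "", names(finance), fixed = TRUE)

# Replace the literal string "N/A" with a real missing value.
finance[finance == "N/A"] <- NA

# Convert every column except Year from character to numeric,
# dropping thousands separators first (an assumed input format).
num_cols <- setdiff(names(finance), "Year")
finance[num_cols] <- lapply(
  finance[num_cols],
  function(x) as.numeric(gsub(",", "", x, fixed = TRUE))
)
```

Replacing “N/A” before the numeric conversion avoids the coercion warnings that `as.numeric()` would otherwise raise for those cells.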

3.3 Customer Satisfaction Data

Initially, we wanted to use the data from this page, but the data there is presented as an image rather than a table. We then found this table with the same information, extracted it, and renamed the columns. We converted the Satisfaction Index column to numeric to make it easier to analyze.

##     Model           Manufacturer        Satisfaction  
##  Length:24          Length:24          Min.   :75.00  
##  Class :character   Class :character   1st Qu.:79.00  
##  Mode  :character   Mode  :character   Median :81.00  
##                                        Mean   :80.83  
##                                        3rd Qu.:82.00  
##                                        Max.   :85.00