Untidy version of DSjobtracker (DSraw)
World cities
city city_ascii lat lng country iso2 iso3
1 Tokyo Tokyo 35.6850 139.7514 Japan JP JPN
2 New York New York 40.6943 -73.9249 United States US USA
3 Mexico City Mexico City 19.4424 -99.1310 Mexico MX MEX
4 Mumbai Mumbai 19.0170 72.8570 India IN IND
5 São Paulo Sao Paulo -23.5587 -46.6250 Brazil BR BRA
6 Delhi Delhi 28.6700 77.2300 India IN IND
7 Shanghai Shanghai 31.2165 121.4365 China CN CHN
8 Kolkata Kolkata 22.4950 88.3247 India IN IND
9 Los Angeles Los Angeles 34.1139 -118.4068 United States US USA
10 Dhaka Dhaka 23.7231 90.4086 Bangladesh BD BGD
admin_name capital population id
1 Tōkyō primary 35676000 1392685764
2 New York 19354922 1840034016
3 Ciudad de México primary 19028000 1484247881
4 Mahārāshtra admin 18978000 1356226629
5 São Paulo admin 18845000 1076532519
6 Delhi admin 15926000 1356872604
7 Shanghai admin 14987000 1156073548
8 West Bengal admin 14787000 1356060520
9 California 12815475 1840020491
10 Dhaka primary 12797394 1050529279
# A tibble: 6 x 152
ID Consultant DateRetrieved DatePublished Job_title Company R SAS
<dbl> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl>
1 1 Thiyanga 05/08/2020 <NA> <NA> <NA> 1 1
2 2 Jayani 07/08/2020 31/07/2020 Junior D~ Dialog~ 1 0
3 3 Jayani 07/08/2020 06/08/20 Engineer~ London~ 0 0
4 4 Jayani 07/08/2020 24/07/2020 CI-Stati~ E.D. B~ 1 1
5 5 Jayani 07/08/2020 24/07/2020 DA-Data ~ E.D. B~ 0 1
6 6 Jayani 07/08/2020 13/08/2020 Data Sci~ Emirat~ 1 0
# ... with 144 more variables: SPSS <dbl>, Python <dbl>, MAtlab <dbl>,
# Scala <dbl>, `C#` <dbl>, `MS Word` <dbl>, `Ms Excel` <dbl>, `OLE/DB` <dbl>,
# `Ms Access` <dbl>, `Ms PowerPoint` <dbl>, Spreadsheets <dbl>,
# Data_visualization <dbl>, Presentation_Skills <dbl>, Communication <dbl>,
# BigData <dbl>, Data_warehouse <dbl>, cloud_storage <dbl>,
# Google_Cloud <dbl>, AWS <dbl>, Machine_Learning <dbl>, `Deep
# Learning` <dbl>, Computer_vision <dbl>, Java <dbl>, `C++` <dbl>, C <dbl>,
# `Linux/Unix` <dbl>, SQL <dbl>, NoSQL <dbl>, RDBMS <dbl>, Oracle <dbl>,
# MySQL <dbl>, PHP <dbl>, Flash_Actionscript <dbl>, SPL <dbl>,
# web_design_and_development_tools <dbl>, Wordpress <dbl>, AI <dbl>,
# `Natural_Language_Processing(NLP)` <dbl>, `Microsoft Power BI` <dbl>,
# Google_Analytics <dbl>, graphics_and_design_skills <dbl>,
# Data_marketing <dbl>, SEO <dbl>, Content_Management <dbl>, Tableau <dbl>,
# D3 <dbl>, Alteryx <dbl>, KNIME <dbl>, Spotfire <dbl>, Spark <dbl>,
# S3 <dbl>, Redshift <dbl>, DigitalOcean <dbl>, Javascript <dbl>,
# Kafka <dbl>, Storm <dbl>, Bash <dbl>, Hadoop <dbl>, Data_Pipelines <dbl>,
# MPP_Platforms <dbl>, Qlik <dbl>, Pig <dbl>, Hive <dbl>, Tensorflow <dbl>,
# `Map/Reduce` <dbl>, Impala <dbl>, Solr <dbl>, Teradata <dbl>,
# MongoDB <dbl>, Elasticsearch <dbl>, YOLO <dbl>, `agile execution` <dbl>,
# Data_management <dbl>, pyspark <dbl>, Data_mining <dbl>,
# Data_science <dbl>, Web_Analytic_tools <dbl>, IOT <dbl>,
# Numerical_Analysis <dbl>, Economic <dbl>, Finance_Knowledge <dbl>,
# Investment_Knowledge <dbl>, Problem_Solving <dbl>, Korean_language <dbl>,
# `Bash\\Linux Scripting` <dbl>, Knowledge_in <chr>, Experience <chr>,
# City <chr>, Location <chr>, Educational_qualifications <chr>, Salary <chr>,
# Team_Handling <dbl>, Debtor_reconcilation <dbl>, Payroll_management <dbl>,
# Bayesian <dbl>, Optimization <dbl>, `Bahasa Malaysia` <dbl>, `English
# proficiency` <chr>, URL <chr>, Search_Term <chr>, ...
Observations: 551
Variables: 152
$ ID <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1...
$ Consultant <chr> "Thiyanga", "Jayani", "Jayani", ...
$ DateRetrieved <chr> "05/08/2020", "07/08/2020", "07/...
$ DatePublished <chr> NA, "31/07/2020", "06/08/20", "2...
$ Job_title <chr> NA, "Junior Data Scientist", "En...
$ Company <chr> NA, "Dialog Axiata PLC", "London...
$ R <dbl> 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0,...
$ SAS <dbl> 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0,...
$ SPSS <dbl> NA, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0...
$ Python <dbl> 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0,...
$ MAtlab <dbl> 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,...
$ Scala <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,...
$ `C#` <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ `MS Word` <dbl> NA, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0...
$ `Ms Excel` <dbl> NA, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0...
$ `OLE/DB` <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ `Ms Access` <dbl> NA, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0...
$ `Ms PowerPoint` <dbl> NA, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0...
$ Spreadsheets <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Data_visualization <dbl> NA, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0...
$ Presentation_Skills <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Communication <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ BigData <dbl> NA, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1...
$ Data_warehouse <dbl> NA, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ cloud_storage <dbl> NA, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Google_Cloud <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ AWS <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Machine_Learning <dbl> NA, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1...
$ `Deep Learning` <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Computer_vision <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,...
$ Java <dbl> NA, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1...
$ `C++` <dbl> NA, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0...
$ C <dbl> NA, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0...
$ `Linux/Unix` <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ SQL <dbl> 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1,...
$ NoSQL <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ RDBMS <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Oracle <dbl> NA, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0...
$ MySQL <dbl> NA, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1...
$ PHP <dbl> NA, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0...
$ Flash_Actionscript <dbl> NA, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0...
$ SPL <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ web_design_and_development_tools <dbl> NA, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0...
$ Wordpress <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ AI <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ `Natural_Language_Processing(NLP)` <dbl> NA, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0...
$ `Microsoft Power BI` <dbl> NA, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0...
$ Google_Analytics <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ graphics_and_design_skills <dbl> NA, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0...
$ Data_marketing <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ SEO <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Content_Management <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Tableau <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0...
$ D3 <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0...
$ Alteryx <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ KNIME <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Spotfire <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Spark <dbl> NA, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0...
$ S3 <dbl> NA, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0...
$ Redshift <dbl> NA, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0...
$ DigitalOcean <dbl> NA, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0...
$ Javascript <dbl> NA, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0...
$ Kafka <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Storm <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Bash <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Hadoop <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1...
$ Data_Pipelines <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ MPP_Platforms <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Qlik <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Pig <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Hive <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1...
$ Tensorflow <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ `Map/Reduce` <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Impala <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Solr <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Teradata <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ MongoDB <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Elasticsearch <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ YOLO <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ `agile execution` <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1...
$ Data_management <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ pyspark <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Data_mining <dbl> NA, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0...
$ Data_science <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0...
$ Web_Analytic_tools <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ IOT <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Numerical_Analysis <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Economic <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Finance_Knowledge <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Investment_Knowledge <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Problem_Solving <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Korean_language <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ `Bash\\Linux Scripting` <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Knowledge_in <chr> NA, "Not_define", "Elasticsearch...
$ Experience <chr> "4+", "2-3", "1-2", "2+", "Not_d...
$ City <chr> NA, "Colombo", "Colombo", "Colom...
$ Location <chr> "NY", "SL", "SL", "SL", "SL", "M...
$ Educational_qualifications <chr> NA, "Degree in Engineering / IT ...
$ Salary <chr> NA, "Not_define", "Not_define", ...
$ Team_Handling <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Debtor_reconcilation <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Payroll_management <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Bayesian <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Optimization <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ `Bahasa Malaysia` <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ `English proficiency` <chr> NA, "Not_define", "Not_define", ...
$ URL <chr> NA, "https://www.google.com/sear...
$ Search_Term <chr> NA, "Data Analysis Jobs in Sri L...
$ X109 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X110 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X111 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X112 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X113 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X114 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X115 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X116 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X117 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X118 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X119 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X120 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X121 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X122 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X123 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X124 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X125 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X126 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X127 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X128 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X129 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X130 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X131 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X132 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X133 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X134 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X135 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X136 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X137 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X138 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X139 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X140 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X141 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X142 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X143 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X144 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X145 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X146 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X147 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X148 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X149 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X150 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X151 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
$ X152 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, ...
### Untidy columns
ID
Consultant - To extract complete cases
Software columns - R, Python, SAS, etc.
Job_title
Experience
Location
Educational_qualifications
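The list above names the columns that need work: `Consultant` is used to keep complete cases, and the printed structure also shows 44 columns (`X109` to `X152`) that hold only `NA`. The package is written in R and its tidying code is not shown here, so the following is only a minimal Python sketch of that filtering step. It assumes a row counts as complete when `Consultant` is present. The helper names `drop_incomplete` and `drop_empty_columns` are hypothetical, and the third sample row is invented for illustration.

```python
def drop_incomplete(rows, key="Consultant"):
    """Keep only rows whose `key` value is present (assumed meaning of 'complete case')."""
    return [r for r in rows if r.get(key) is not None]


def drop_empty_columns(rows):
    """Remove columns that are missing (None) in every row, e.g. X109 ... X152."""
    if not rows:
        return []
    keep = [c for c in rows[0] if any(r.get(c) is not None for r in rows)]
    return [{c: r.get(c) for c in keep} for r in rows]


raw = [
    {"ID": 1, "Consultant": "Thiyanga", "R": 1, "X109": None},
    {"ID": 2, "Consultant": "Jayani", "R": 1, "X109": None},
    # Hypothetical incomplete row, invented for this example.
    {"ID": 3, "Consultant": None, "R": None, "X109": None},
]

tidy = drop_empty_columns(drop_incomplete(raw))
print(len(tidy), list(tidy[0]))
```

Filtering rows first and columns second means a column that is filled only in discarded rows is also dropped.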
### Software
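Comparing the raw and tidied structures, the skill indicator columns lose their `NA`s (for example, `SPSS` in the first row goes from `NA` to 0), and column names such as `MS Word`, `OLE/DB`, and `C#` become `MS_Word`, `OLE_DB`, and `C.`. The sketch below reproduces both effects in Python. The rules are inferred from the printed output only, not taken from the package's R source, and the names `fill_missing_indicators` and `clean_name` are hypothetical.

```python
import re


def fill_missing_indicators(rows, indicator_cols):
    """Replace missing values (None) with 0 in the 0/1 skill columns only."""
    return [
        {c: (0 if c in indicator_cols and v is None else v) for c, v in r.items()}
        for r in rows
    ]


def clean_name(name):
    """Make a column name syntactic: separators become '_', other symbols become '.'."""
    name = re.sub(r"[ /\\]+", "_", name)            # space, slash, backslash
    return re.sub(r"[^A-Za-z0-9_.]", ".", name)     # e.g. '#', '+', '(', ')'


raw = [
    {"ID": 1, "R": 1, "SPSS": None, "Job_title": None},
    {"ID": 2, "R": 1, "SPSS": 0, "Job_title": "Junior Data Scientist"},
]
filled = fill_missing_indicators(raw, {"R", "SPSS"})
print(filled[0])
print([clean_name(n) for n in ["C#", "C++", "MS Word", "OLE/DB"]])
```

Restricting the fill to the indicator columns leaves genuinely missing text fields, such as `Job_title`, as missing.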
### Job_title
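In the tidied data, a new `Job_Titles` column holds a normalised copy of `Job_title`: "CI-Statistical Analyst/Business Analyst-CMB" appears as "ci statistical analyst business analyst cmb", and "Engineer, Analytics & Data Science" as "engineer analytics & data science". A minimal Python sketch consistent with those examples follows. The exact rule used by the package is not shown, so the choice of separators (comma, slash, hyphen) is an assumption, and `normalise_title` is a hypothetical name.

```python
import re


def normalise_title(title):
    """Lower-case a job title and turn ',', '/' and '-' into single spaces."""
    if title is None:
        return None                       # keep missing titles missing
    text = re.sub(r"[,/\-]", " ", title.lower())
    return re.sub(r"\s+", " ", text).strip()


titles = [
    None,
    "Junior Data Scientist",
    "Engineer, Analytics & Data Science",
    "CI-Statistical Analyst/Business Analyst-CMB",
]
job_titles = [normalise_title(t) for t in titles]
print(job_titles)
```

Keeping the original `Job_title` alongside the normalised copy makes it possible to check any later grouping, such as `Job_Category`, against the source text.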
### Experience
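In the raw data, fields with no stated value carry the placeholder "Not_define" (for example in `Experience`, `Knowledge_in`, and `Salary`). In the tidied data the same cells print as `NA`. The Python sketch below shows only that placeholder replacement, under the assumption that it is applied to the character columns. It does not attempt to reproduce how `Minimum_Years_of_experience` is derived, because that rule is not visible in the output. `blank_to_missing` is a hypothetical name.

```python
NOT_DEFINED = "Not_define"  # placeholder string seen in the raw data


def blank_to_missing(rows, columns, sentinel=NOT_DEFINED):
    """Replace the sentinel string with None in the given columns."""
    return [
        {c: (None if c in columns and v == sentinel else v) for c, v in r.items()}
        for r in rows
    ]


raw = [
    {"ID": 2, "Experience": "2-3", "Salary": "Not_define"},
    {"ID": 5, "Experience": "Not_define", "Salary": "Not_define"},
]
cleaned = blank_to_missing(raw, {"Experience", "Salary"})
print(cleaned)
```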
### Location
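The tidied output recodes the `Location` value "SL" to "LK" and adds a `Job_Country` column ("NY" gives "United States", "LK" gives "Sri Lanka"). A small Python sketch of such a recode is shown here. The two lookup tables contain only the values visible in the printed rows, so they are illustrative rather than complete, and `tidy_location` is a hypothetical name.

```python
# Only the codes visible in the printed output; the real data has more.
LOCATION_FIXES = {"SL": "LK"}
COUNTRY_BY_LOCATION = {
    "NY": "United States",
    "LK": "Sri Lanka",
    "Malaysia": "Malaysia",
}


def tidy_location(row):
    """Return a copy of `row` with a corrected Location and a derived Job_Country."""
    out = dict(row)
    loc = LOCATION_FIXES.get(row.get("Location"), row.get("Location"))
    out["Location"] = loc
    out["Job_Country"] = COUNTRY_BY_LOCATION.get(loc)  # None if the code is unknown
    return out


rows = [{"ID": 1, "Location": "NY"}, {"ID": 2, "Location": "SL"}]
located = [tidy_location(r) for r in rows]
print(located)
```

An unknown code maps to `None` instead of raising an error, so unmapped locations stay visible as missing values.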
### Educational_qualifications
ID Consultant DateRetrieved DatePublished
1 1 Thiyanga 05/08/2020 <NA>
320 2 Jayani 07/08/2020 31/07/2020
321 3 Jayani 07/08/2020 06/08/20
322 4 Jayani 07/08/2020 24/07/2020
95 5 Jayani 07/08/2020 24/07/2020
244 6 Jayani 07/08/2020 13/08/2020
Job_title
1 <NA>
320 Junior Data Scientist
321 Engineer, Analytics & Data Science
322 CI-Statistical Analyst/Business Analyst-CMB
95 DA-Data Analyst-CMB
244 Data Scientist
Company R SAS SPSS Python MAtlab
1 <NA> 1 1 0 1 1
320 Dialog Axiata PLC 1 0 0 1 0
321 London Stock Exchange Group plc3.1 0 0 0 1 0
322 E.D. Bullard Company 1 1 1 0 0
95 E.D. Bullard Company 0 1 1 0 0
244 Emirates Center for Strategic Studies and Research 1 0 0 1 0
Scala C. MS_Word Ms_Excel OLE_DB Ms_Access Ms_PowerPoint Spreadsheets
1 0 0 0 0 0 0 0 0
320 0 0 0 0 0 0 0 0
321 0 0 0 0 0 0 0 0
322 0 0 0 0 0 0 0 0
95 0 0 1 1 0 1 1 0
244 0 0 0 0 0 0 0 0
Data_visualization Presentation_Skills Communication BigData Data_warehouse
1 0 0 0 0 0
320 1 0 0 1 1
321 1 0 0 1 0
322 0 0 0 0 0
95 0 0 0 0 0
244 0 0 0 0 0
cloud_storage Google_Cloud AWS Machine_Learning Deep_Learning
1 0 0 0 0 0
320 1 0 0 1 0
321 0 0 0 1 0
322 0 0 0 0 0
95 0 0 0 0 0
244 0 0 0 1 0
Computer_vision Java C.. C Linux_Unix SQL NoSQL RDBMS Oracle MySQL PHP
1 0 0 0 0 0 1 0 0 0 0 0
320 0 0 0 0 0 0 0 0 0 0 0
321 0 0 0 0 0 0 0 0 0 0 0
322 0 0 0 0 0 1 0 0 0 0 0
95 0 0 0 0 0 1 0 0 1 1 0
244 0 1 1 1 0 1 0 0 0 0 0
Flash_Actionscript SPL web_design_and_development_tools Wordpress AI
1 0 0 0 0 0
320 0 0 0 0 0
321 0 0 0 0 0
322 0 0 0 0 0
95 0 0 0 0 0
244 0 0 0 0 0
Natural_Language_Processing.NLP. Microsoft_Power_BI Google_Analytics
1 0 0 0
320 0 0 0
321 1 1 0
322 0 0 0
95 0 0 0
244 0 0 0
graphics_and_design_skills Data_marketing SEO Content_Management Tableau D3
1 0 0 0 0 0 0
320 0 0 0 0 0 0
321 0 0 0 0 0 0
322 0 0 0 0 0 0
95 0 0 0 0 0 0
244 0 0 0 0 0 0
Alteryx KNIME Spotfire Spark S3 Redshift DigitalOcean Javascript Kafka
1 0 0 0 0 0 0 0 0 0
320 0 0 0 0 0 0 0 0 0
321 0 0 0 0 0 0 0 0 0
322 0 0 0 0 0 0 0 0 0
95 0 0 0 0 0 0 0 0 0
244 0 0 0 1 1 1 1 1 0
Storm Bash Hadoop Data_Pipelines MPP_Platforms Qlik Pig Hive Tensorflow
1 0 0 0 0 0 0 0 0 0
320 0 0 0 0 0 0 0 0 0
321 0 0 0 0 0 0 0 0 0
322 0 0 0 0 0 0 0 0 0
95 0 0 0 0 0 0 0 0 0
244 0 0 0 0 0 0 0 0 0
Map_Reduce Impala Solr Teradata MongoDB Elasticsearch YOLO agile_execution
1 0 0 0 0 0 0 0 0
320 0 0 0 0 0 0 0 0
321 0 0 0 0 0 0 0 0
322 0 0 0 0 0 0 0 0
95 0 0 0 0 0 0 0 0
244 0 0 0 0 0 0 0 0
Data_management pyspark Data_mining Data_science Web_Analytic_tools IOT
1 0 0 0 0 0 0
320 0 0 0 0 0 0
321 0 0 0 0 0 0
322 0 0 0 0 0 0
95 0 0 0 0 0 0
244 0 0 1 0 0 0
Numerical_Analysis Economic Finance_Knowledge Investment_Knowledge
1 0 0 0 0
320 0 0 0 0
321 0 0 0 0
322 0 0 0 0
95 0 0 0 0
244 0 0 0 0
Problem_Solving Korean_language Bash_Linux_Scripting
1 0 0 0
320 0 0 0
321 0 0 0
322 0 0 0
95 0 0 0
244 0 0 0
Knowledge_in Experience City Location
1 <NA> 4+ <NA> NY
320 <NA> 2-3 Colombo LK
321 Elasticsearch, Logstash, Kibana 1-2 Colombo LK
322 <NA> 2+ Colombo LK
95 <NA> <NA> Colombo LK
244 <NA> 5-7 Kuala Lumpur Malaysia
Educational_qualifications
1 <NA>
320 Degree in Engineering / IT or specialized in Computer Science / Statistics from a recognized university or institute
321 Degree in Statistics / Mathematics / Computer Science.
322 Undergraduate degree in statistics, mathematics or engineering
95 Bachelor's in Information Management, Information Technology, Computing, Mathematics, Statistics, or related fields
244 Master's or PHD in Statistics, Mathematics, Computer Science or another quantitative field
Salary Team_Handling Debtor_reconcilation Payroll_management Bayesian
1 <NA> 0 0 0 0
320 <NA> 0 0 0 0
321 <NA> 0 0 0 0
322 <NA> 0 0 0 0
95 <NA> 0 0 0 0
244 <NA> 0 0 0 0
Optimization Bahasa_Malaysia English_proficiency
1 0 0 <NA>
320 0 0 <NA>
321 0 0 <NA>
322 0 0 <NA>
95 0 0 <NA>
244 0 0 <NA>
URL
1 <NA>
320 https://www.google.com/search?sxsrf=ALeKk00MUun1FouYtWJYm7L0o3wlM5pWbA:1596811359019&source=hp&ei=XmgtX9XyO-G_8QOttrSQAg&q=latest+jobs+for+data+scientist&oq=Latest+Jobs+for+data+scie&gs_lcp=CgZwc3ktYWIQAxgAMggIIRAWEB0QHjIICCEQFhAdEB4yCAghEBYQHRAeMggIIRAWEB0QHjIICCEQFhAdEB4yCAghEBYQHRAeMggIIRAWEB0QHjIICCEQFhAdEB4yCAghEBYQHRAeMggIIRAWEB0QHjoHCCMQ6gIQJzoECCMQJzoICAAQkQIQiwM6CAgAELEDEIMBOggILhCxAxCDAToFCAAQsQM6DggAELEDEIMBEJECEIsDOgsIABCxAxCDARCLAzoHCAAQAxCLAzoICC4QsQMQiwM6CAgAELEDEIsDOgUIABCLAzoCCAA6BggAEBYQHjoFCCEQoAFQ4RhY87gBYJ_IAWgCcAB4AIABwgOIAfQwkgEKMC4xNC43LjQuM5gBAKABAaoBB2d3cy13aXqwAQq4AQI&sclient=psy-ab&ibp=htl;jobs&sa=X&ved=2ahUKEwi7iIn-qYnrAhXS7XMBHR2PCx8Qp4wCMAB6BAgLEAE#fpstate=tldetail&htivrt=jobs&htiq=latest+jobs+for+data+scientist&htidocid=G184piKqa2o_fj-gAAAAAA%3D%3D&sxsrf=ALeKk00mvUvmmBGPtIAJqR8AKbUqgn_goA:1596811391427
321 https://www.glassdoor.com/Job/sri-lanka-statistics-jobs-SRCH_IL.0,9_IN45_KO10,20.htm
322 https://www.glassdoor.com/Job/sri-lanka-statistics-jobs-SRCH_IL.0,9_IN45_KO10,20.htm
95 https://www.glassdoor.com/Job/sri-lanka-statistics-jobs-SRCH_IL.0,9_IN45_KO10,20.htm
244 https://www.glassdoor.com/Job/jobs.htm?suggestCount=0&suggestChosen=false&clickSource=searchBtn&typedKeyword=&locT=N&locId=170&jobType=&context=Jobs&sc.keyword=statistics&dropdown=0
Search_Term Job_Titles
1 <NA> <NA>
320 Data Analysis Jobs in Sri Lanka junior data scientist
321 Data Analysis Jobs in Sri Lanka engineer analytics & data science
322 Data Analysis Jobs in Sri Lanka ci statistical analyst business analyst cmb
95 Data Analysis Jobs in Sri Lanka da data analyst cmb
244 Statistics top jobs in Malaysia data scientist
Job_Category Minimum_Years_of_experience Experience_Category Job_Country
1 Unimportant NA Two or less years United States
320 Data Science 2 Two or less years Sri Lanka
321 Data Science 1 Two or less years Sri Lanka
322 Data Analyst 2 Two or less years Sri Lanka
95 Data Analyst NA Two or less years Sri Lanka
244 Data Science NA Two or less years Malaysia
Edu_Category
1 <NA>
320 Some Degree
321 Some Degree
322 Some Degree
95 Min_Bsc
244 Min_Master
Observations: 435
Variables: 114
$ ID <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,...
$ Consultant <chr> "Thiyanga", "Jayani", "Jayani", "J...
$ DateRetrieved <chr> "05/08/2020", "07/08/2020", "07/08...
$ DatePublished <chr> NA, "31/07/2020", "06/08/20", "24/...
$ Job_title <chr> NA, "Junior Data Scientist", "Engi...
$ Company <chr> NA, "Dialog Axiata PLC", "London S...
$ R <dbl> 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0...
$ SAS <dbl> 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0...
$ SPSS <dbl> 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0...
$ Python <dbl> 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0...
$ MAtlab <dbl> 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Scala <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ C. <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ MS_Word <dbl> 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1...
$ Ms_Excel <dbl> 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1...
$ OLE_DB <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Ms_Access <dbl> 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1...
$ Ms_PowerPoint <dbl> 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0...
$ Spreadsheets <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Data_visualization <dbl> 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0...
$ Presentation_Skills <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Communication <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ BigData <dbl> 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0...
$ Data_warehouse <dbl> 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ cloud_storage <dbl> 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Google_Cloud <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ AWS <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Machine_Learning <dbl> 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0...
$ Deep_Learning <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Computer_vision <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Java <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0...
$ C.. <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0...
$ C <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0...
$ Linux_Unix <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ SQL <dbl> 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0...
$ NoSQL <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ RDBMS <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Oracle <dbl> 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0...
$ MySQL <dbl> 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0...
$ PHP <dbl> 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0...
$ Flash_Actionscript <dbl> 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0...
$ SPL <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ web_design_and_development_tools <dbl> 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0...
$ Wordpress <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ AI <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Natural_Language_Processing.NLP. <dbl> 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0...
$ Microsoft_Power_BI <dbl> 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Google_Analytics <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ graphics_and_design_skills <dbl> 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0...
$ Data_marketing <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ SEO <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Content_Management <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Tableau <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0...
$ D3 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0...
$ Alteryx <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ KNIME <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Spotfire <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Spark <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0...
$ S3 <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0...
$ Redshift <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0...
$ DigitalOcean <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0...
$ Javascript <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0...
$ Kafka <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Storm <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Bash <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Hadoop <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0...
$ Data_Pipelines <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ MPP_Platforms <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Qlik <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Pig <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Hive <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0...
$ Tensorflow <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Map_Reduce <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Impala <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Solr <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Teradata <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ MongoDB <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Elasticsearch <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ YOLO <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ agile_execution <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0...
$ Data_management <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ pyspark <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Data_mining <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0...
$ Data_science <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0...
$ Web_Analytic_tools <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ IOT <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Numerical_Analysis <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Economic <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Finance_Knowledge <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Investment_Knowledge <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Problem_Solving <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Korean_language <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Bash_Linux_Scripting <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Knowledge_in <chr> NA, NA, "Elasticsearch, Logstash, ...
$ Experience <chr> "4+", "2-3", "1-2", "2+", NA, "5-7...
$ City <chr> NA, "Colombo", "Colombo", "Colombo...
$ Location <chr> "NY", "LK", "LK", "LK", "LK", "Mal...
$ Educational_qualifications <chr> NA, "Degree in Engineering / IT or...
$ Salary <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ Team_Handling <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Debtor_reconcilation <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Payroll_management <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Bayesian <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Optimization <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Bahasa_Malaysia <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ English_proficiency <chr> NA, NA, NA, NA, NA, NA, "1", NA, N...
$ URL <chr> NA, "https://www.google.com/search...
$ Search_Term <chr> NA, "Data Analysis Jobs in Sri Lan...
$ Job_Titles <chr> NA, "junior data scientist", "engi...
$ Job_Category <chr> "Unimportant", "Data Science", "Da...
$ Minimum_Years_of_experience <dbl> NA, 2, 1, 2, NA, NA, NA, NA, 1, NA...
$ Experience_Category <chr> "Two or less years", "Two or less ...
$ Job_Country <chr> "United States", "Sri Lanka", "Sri...
$ Edu_Category <chr> NA, "Some Degree", "Some Degree", ...
Author:
Jayani Lakshika Piyadi Gamage
Link to the Git-repository: