King Data Scientist Interview Questions | Glassdoor

# King Data Scientist Interview Questions

## Interviews at King

21 Interview Reviews

Experience: 47%, 24%, 29%

Getting an Interview: 72%, 11%, 11%, 6

Difficulty: 3.1 (Average)

## Data Scientist Interview

Declined Offer
Neutral Experience
Average Interview

Application

I applied online. The process took 4+ weeks. I interviewed at King (Berlin, Germany) in April 2017.

Interview

I received a test as a Word document and was asked to complete it within 2 hours and send it back in the same format. It felt a little unprofessional, but HR were at least nice. I didn't get past the test, so I assume my answers weren't good enough.

Interview Questions

• Question 1. (5-20 minutes)
A cube is painted Green on all six sides. It is divided into 125 (=5 x 5 x 5) equal smaller cubes.

Find:
1.1 The number of smaller cubes having a) 3 faces coloured? b) Exactly 2 faces coloured? c) Exactly 1 face coloured? d) 0 faces coloured?
1.2 All 125 cubes are put into a bag. If a single cube is selected at random from the bag, find the probability of picking a cube having 1 or more Green faces.
1.3 What is the average number of Green faces on a cube?

In the above situation N = 5 (with N^3 = 125).
1.4 For general N, give a formula for the number of smaller cubes with exactly 2 faces coloured.
1.5 For what values of N is this formula correct?
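One way to sanity-check Question 1 is to enumerate the small cubes by brute force. The sketch below is only an illustration, not the expected answer format: the function name `painted_face_counts` is made up here, and it assumes N >= 2 (for N = 1 the single cube keeps all six painted faces, which this boundary test does not model). It also compares the edge-counting formula 12(N - 2) for exactly two painted faces against the enumeration.

```python
from fractions import Fraction
from itertools import product


def painted_face_counts(n):
    """Count the small cubes of an n x n x n painted cube by painted faces.

    Assumes n >= 2. Returns a dict {number_of_painted_faces: cube_count}.
    """
    counts = {k: 0 for k in range(4)}
    for cell in product(range(n), repeat=3):
        # Each coordinate lying on the boundary (0 or n - 1) exposes one face.
        faces = sum(1 for c in cell if c in (0, n - 1))
        counts[faces] += 1
    return counts


counts = painted_face_counts(5)
total = sum(counts.values())

# Q 1.2: probability of at least one green face.
p_painted = Fraction(total - counts[0], total)

# Q 1.3: average number of green faces per small cube.
avg_faces = Fraction(sum(k * v for k, v in counts.items()), total)

# Q 1.4 / 1.5: 12 edges, each with (n - 2) non-corner cubes, checked for n >= 2.
formula_ok = all(painted_face_counts(n)[2] == 12 * (n - 2) for n in range(2, 9))

print(counts, p_painted, avg_faces, formula_ok)
```

For N = 5 this gives 8, 36, 54 and 27 cubes with three, two, one and zero painted faces, a probability of 98/125, and an average of 6/5 green faces per cube. The formula 12(N - 2) holds for N >= 2 and fails for N = 1, where it would give a negative count.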
• Question 2. (5-10 minutes)

Write a program which calculates the sum of the first 10 Fibonacci numbers that are bigger than 1000.

You may use pseudo-code or any programming language you like (Java, C, C++, R, Python, etc.); please state what language you are using.
[Note: Fibonacci numbers are the following series: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ... where each term is the sum of the previous two terms]
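A minimal sketch of one possible answer to Question 2, written in Python. The function name `sum_fib_above` and its parameters are illustrative choices, and "bigger than 1000" is read as strictly greater than 1000.

```python
def sum_fib_above(threshold=1000, count=10):
    """Sum the first `count` Fibonacci numbers strictly greater than `threshold`."""
    a, b = 0, 1          # consecutive Fibonacci numbers, starting 0, 1
    total = 0
    found = 0
    while found < count:
        if a > threshold:
            total += a
            found += 1
        a, b = b, a + b  # advance one step along the series
    return total


print(sum_fib_above())  # 315227
```

The ten terms summed are 1597 through 121393. The loop keeps only two running values, so no list of Fibonacci numbers needs to be stored.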
• Q 3.1: In SQL, what statement will count the total number of rows in this table?

Q 3.2: In SQL, what statement will count the total number of rows corresponding to logins on 1st May 2015?

Q 3.3: In SQL, what statement will count the number of unique customers?

Q 3.4: In SQL, what statement would be used to create the above table (without any data in it)?

Q 3.5: In SQL, what is the difference between an inner join and an outer join? Please give an example of some circumstances where you would recommend using an outer join.
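The table the question refers to is not reproduced in this review, so the sketch below assumes a hypothetical `logins(customer_id, login_date)` table, plus an equally hypothetical `customers` table for the join example. It runs the statements through Python's built-in `sqlite3` module so they can be tried end to end. Column names, types and the ISO date format are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Q 3.4: create the (assumed) table with no data in it.
conn.executescript("""
    CREATE TABLE logins (
        customer_id INTEGER NOT NULL,
        login_date  TEXT    NOT NULL  -- assumed ISO format, e.g. '2015-05-01'
    );
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
""")

# Illustrative rows only; not taken from the original test.
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "Ann"), (2, "Bob"), (3, "Cy")])
conn.executemany("INSERT INTO logins VALUES (?, ?)",
                 [(1, "2015-05-01"), (1, "2015-05-02"), (2, "2015-05-01")])

# Q 3.1: total number of rows.
total_rows = conn.execute("SELECT COUNT(*) FROM logins").fetchone()[0]

# Q 3.2: rows for logins on 1st May 2015.
may_first = conn.execute(
    "SELECT COUNT(*) FROM logins WHERE login_date = '2015-05-01'"
).fetchone()[0]

# Q 3.3: number of unique customers.
unique_customers = conn.execute(
    "SELECT COUNT(DISTINCT customer_id) FROM logins"
).fetchone()[0]

# Q 3.5: an inner join keeps only customers that have a matching login ...
inner_rows = conn.execute("""
    SELECT c.name, l.login_date
    FROM customers c
    INNER JOIN logins l ON l.customer_id = c.customer_id
""").fetchall()

# ... while a left outer join also keeps customers with no login (NULL date).
outer_rows = conn.execute("""
    SELECT c.name, l.login_date
    FROM customers c
    LEFT OUTER JOIN logins l ON l.customer_id = c.customer_id
    ORDER BY c.customer_id, l.login_date
""").fetchall()

print(total_rows, may_first, unique_customers, len(inner_rows), len(outer_rows))
```

With these sample rows, the outer join returns one extra row for the customer who never logged in. That is the typical reason to prefer it: for example, when listing all customers together with their activity, including inactive ones.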
• Question 4: (5-10 minutes)

The chart below represents the distribution of play time in a day, for a sample of our players. The points a, b and c mark the values of three measures used to summarise the data.

Q 4.1: What standard measures are a, b and c most likely denoting? Please explain why.
Q 4.2: If you drew n samples from this distribution and measured their mean, then repeated that many times, how would you expect the distribution of those sample means to differ from this distribution?
Q 4.3: Would its standard deviation be bigger, smaller, or the same as this distribution's standard deviation, and why?
• Question 5: (20-50 minutes) (suggest you read the whole question before starting on it)
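Q 4.2 and Q 4.3 can be explored with a small simulation. The chart is not reproduced here, so the sketch below substitutes an exponential distribution with a mean of 60 as a stand-in for a skewed play-time distribution. That choice, along with the sample size `n` and the number of repeats, is an assumption made purely for illustration. By the central limit theorem, the sample means should cluster around the same mean with a standard deviation close to sigma / sqrt(n), which is smaller than the original distribution's.

```python
import random
import statistics

rng = random.Random(0)      # fixed seed so the run is repeatable

mean_play = 60.0            # assumed mean of the stand-in distribution
population_sd = 60.0        # an exponential's sd equals its mean
n = 50                      # observations per sample (assumed)
repeats = 2000              # number of sample means collected (assumed)

sample_means = []
for _ in range(repeats):
    sample = [rng.expovariate(1.0 / mean_play) for _ in range(n)]
    sample_means.append(statistics.fmean(sample))

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
expected_sd = population_sd / n ** 0.5   # sigma / sqrt(n)

print(f"mean of sample means: {mean_of_means:.2f}")
print(f"sd of sample means:   {sd_of_means:.2f} (theory: {expected_sd:.2f})")
```

A histogram of `sample_means` would also look far more symmetric than the skewed stand-in distribution, which is the other half of the expected answer to Q 4.2.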

SITO elevators have a system to simulate different elevator control mechanisms in their buildings. Let us assume a building has M elevators and N floors.
You are in charge of measuring the performance of different elevator control mechanisms and so will need to design the data model to capture observed data and the measurements you would calculate on that data model. Assume a typical simulation run proceeds over a 24 hour period and you are allowed to observe as much as you like (when/where each elevator is, how many passengers, where the passengers are, when/where they arrive/depart, etc – if in doubt assume you can observe it). From your detailed data model you will then need (with very simple calculations) to determine various performance measures, for example:

  • Average waiting time per passenger
  • Average journey time per passenger
  • ... etc

Q 5.1: List the different stakeholders who would be interested in elevator performance (a “stakeholder” is any person or group who have an interest in or may be affected by some aspect of elevator performance).
Q 5.2: List other performance measures that it would be useful or important to measure – make sure these cover all of the stakeholders. (Hint: there are lots and lots of these. Aim for 10 or more...).
Q 5.3: What would a suitable data representation look like? Please design a series of tables (as would be suitable to put in a database or spreadsheet). Make sure that the data representation (with very simple arithmetic calculations) is adequate to calculate the above measures, and any other measures that you deem important (and that those calculations are fairly easy and unambiguous). Please point out any problems you might expect to arise with your data model.
Q 5.4: For “Average waiting time per passenger” and at least 2 other performance measures, describe how they can be easily calculated from your data model. Preferably write the SQL code you would use to calculate the waiting and journey times.
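As a rough sketch of what an answer to Q 5.3 and Q 5.4 might look like, the example below uses a single hypothetical `passenger_trips` table with one row per passenger journey, with times stored as seconds since the start of the simulation run. The table name, columns and sample rows are all assumptions made for illustration. A fuller model would likely add tables for elevator positions, door events and energy use to cover the other stakeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical data model: one row per completed passenger trip.
conn.execute("""
    CREATE TABLE passenger_trips (
        passenger_id      INTEGER PRIMARY KEY,
        elevator_id       INTEGER NOT NULL,
        origin_floor      INTEGER NOT NULL,
        destination_floor INTEGER NOT NULL,
        call_time         INTEGER NOT NULL,  -- passenger presses the button
        board_time        INTEGER NOT NULL,  -- passenger enters the elevator
        alight_time       INTEGER NOT NULL   -- passenger leaves the elevator
    )
""")

# Made-up sample rows: (passenger, elevator, from, to, call, board, alight).
conn.executemany(
    "INSERT INTO passenger_trips VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 1, 0, 5, 0, 30, 60),
        (2, 2, 3, 0, 100, 110, 150),
        (3, 1, 0, 2, 200, 220, 235),
    ],
)

# Average waiting time per passenger: time from call to boarding.
avg_wait = conn.execute(
    "SELECT AVG(board_time - call_time) FROM passenger_trips"
).fetchone()[0]

# Average journey time per passenger: time from boarding to alighting.
avg_journey = conn.execute(
    "SELECT AVG(alight_time - board_time) FROM passenger_trips"
).fetchone()[0]

# A third measure: passengers carried per elevator (load balance).
trips_per_elevator = conn.execute("""
    SELECT elevator_id, COUNT(*)
    FROM passenger_trips
    GROUP BY elevator_id
    ORDER BY elevator_id
""").fetchall()

print(avg_wait, avg_journey, trips_per_elevator)
```

One problem worth flagging with a model like this is that it only records completed trips. Passengers who give up waiting, or who are still travelling when the 24-hour run ends, would need nullable timestamps or a separate events table.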