I applied online. The process took 4 days. I interviewed at Impetus Technologies (Noida) in Aug 2021.
Interview
Technical round. It lasted around 30 to 40 minutes. It started with Spark concepts, then moved to Spark coding, then Python coding, and finally writing SQL queries.
Interview questions [1]
Question 1
Q: 1, 2, 4, 8, 16, 32, 64, ... ∞. These numbers are powers of two (formula: 2^x where x >= 0). Write a method that returns true for such numbers and false otherwise.
Input: any number
Output: True/False
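One common way to answer this is a bit trick, sketched below in Python. The function name is_power_of_two is only illustrative, and the sketch assumes the input is an integer.

```python
def is_power_of_two(n: int) -> bool:
    """Return True if n equals 2**x for some integer x >= 0."""
    # A power of two has exactly one bit set, so n & (n - 1) clears
    # that bit and leaves 0. The n > 0 guard rejects zero and negatives.
    return n > 0 and (n & (n - 1)) == 0


for value in (1, 2, 6, 64, 0):
    print(value, is_power_of_two(value))
```

A loop that keeps dividing by 2 would also work; the bitwise check just avoids the loop.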
Q: Original EMP Table:
ID NAME DEP
1 abc N/W
2 def N/W
3 ghi S/W
4 jkl S/W
Incremental Data:
ID NAME DEP
1 abc S/W
5 mno N/W
How will you find the delta data and migrate it to the EMP table?
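A minimal sketch of the delta logic in plain Python, so it runs without a cluster. It assumes ID is the key, that the incremental feed only carries inserts and updates (no deletes), and the names find_delta and apply_delta are hypothetical. In Spark or SQL the same idea would usually be expressed as a join on ID or a merge/upsert.

```python
def find_delta(current, incoming):
    """Split incoming rows into inserts (new IDs) and updates (changed rows)."""
    inserts = {k: v for k, v in incoming.items() if k not in current}
    updates = {k: v for k, v in incoming.items()
               if k in current and current[k] != v}
    return inserts, updates


def apply_delta(current, incoming):
    """Return a new table with the updates and inserts applied (upsert)."""
    inserts, updates = find_delta(current, incoming)
    merged = dict(current)   # copy so the original table is left untouched
    merged.update(updates)   # overwrite changed rows
    merged.update(inserts)   # add brand-new rows
    return merged


# Tables from the question, keyed by ID -> (NAME, DEP)
emp = {1: ("abc", "N/W"), 2: ("def", "N/W"), 3: ("ghi", "S/W"), 4: ("jkl", "S/W")}
incremental = {1: ("abc", "S/W"), 5: ("mno", "N/W")}

inserts, updates = find_delta(emp, incremental)
merged = apply_delta(emp, incremental)
print("inserts:", inserts)
print("updates:", updates)
print("merged:", merged)
```

With the sample data, ID 1 is an update (DEP changes from N/W to S/W) and ID 5 is an insert.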
Q: Name~|Age~|Comment
df = spark.read.option("delimiter", "~|").csv("")
Q. What are the types of dist key in Redshift?
Q. What is the difference between wide and narrow transformations?
Q. How does a broadcast join work?
Q. How do you do tuning in Spark?
Q. How does partitioning work in Spark?
Q.
Load()
filter()
map()
filter()
count()
filter()
collect()
How many stages and jobs are there?
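One way to reason about this is the usual rule of thumb: each action triggers a job, and each job has one stage plus one more for every shuffle (wide transformation) in its lineage. The sketch below is a toy counter in plain Python, not Spark itself. The helper name count_jobs_and_stages is hypothetical, and it assumes the chain is read as a single lineage in which count() and collect() are the only actions.

```python
ACTIONS = {"count", "collect"}   # each action triggers one job
WIDE = set()                     # no shuffle operations appear in this chain


def count_jobs_and_stages(ops):
    """Apply the rule of thumb: 1 job per action, 1 + shuffles stages per job."""
    jobs = stages = shuffles = 0
    for op in ops:
        name = op.rstrip("()").lower()
        if name in ACTIONS:
            jobs += 1
            stages += 1 + shuffles   # stages needed to compute this action
        elif name in WIDE:
            shuffles += 1            # a shuffle adds a stage boundary
    return jobs, stages


chain = ["Load()", "filter()", "map()", "filter()", "count()", "filter()", "collect()"]
print(count_jobs_and_stages(chain))
```

Under these assumptions the chain gives 2 jobs and 2 stages, since filter() and map() are narrow. Real Spark can differ: stages whose shuffle output is reused may be skipped, and some reads may launch extra jobs of their own.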
Q. Given input.csv, find the output. The file contents and desired output are given below.
username, mobile
user1,999999991:888888882
user3,777777771
user2,777777234:823232351
user5,734452343:943433434:834323434
user1,999999991:9994433777
output
user1:3
user2:2
user3:1
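A minimal sketch for this last question in plain Python, assuming the desired output is the number of distinct mobile numbers per username, sorted by username. The function name count_distinct_mobiles is hypothetical, and the sample text stands in for input.csv.

```python
import csv
import io
from collections import defaultdict

SAMPLE = """username, mobile
user1,999999991:888888882
user3,777777771
user2,777777234:823232351
user5,734452343:943433434:834323434
user1,999999991:9994433777
"""


def count_distinct_mobiles(text):
    """Map each username to the number of distinct mobile numbers seen."""
    numbers = defaultdict(set)
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed lines
        user, mobiles = row[0].strip(), row[1].strip()
        # Numbers are colon-separated; a set drops repeats across rows
        numbers[user].update(m for m in mobiles.split(":") if m)
    return {user: len(nums) for user, nums in sorted(numbers.items())}


result = count_distinct_mobiles(SAMPLE)
for user, total in result.items():
    print(f"{user}:{total}")
```

Under this reading user1 gets 3 because 999999991 repeats across two rows and is counted once. The sketch would also emit a line for user5, which the listed output does not show.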