I first had the HR interview, where I was asked about my work experience. A few weeks later, I had the technical interview, which consisted of several situational technical problems.
The first problem was about memory. They showed me CloudWatch metrics, asked me to identify the problem and explain how I would troubleshoot it, and asked me to explain what a memory leak is. During this session, they also gave me a snippet containing errors from the MJM logs and asked me to analyze what had happened. The issue was ultimately traced back to a memory leak.
Another scenario involved an error that occurred after a GKE upgrade: "no resources available" in specific zones (xxx/zones/eu-west-1, eu-west-2). The fix was to delete the affected nodes so that new ones could be deployed. I was also asked about cordon and taint in Kubernetes, since both were relevant to this post-upgrade error. To mitigate such issues in the future, deploying resources across two availability zones (AZs) was recommended.
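For the cordon and taint question, a small Python model of the scheduling decision may help. This is a simplified sketch, not the real scheduler: the class names and the example nodes are hypothetical, and it assumes that a cordoned node simply accepts no new pods, while a `NoSchedule` or `NoExecute` taint blocks any pod without a matching toleration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Taint:
    key: str
    effect: str  # "NoSchedule", "PreferNoSchedule", or "NoExecute"
    value: str = ""


@dataclass(frozen=True)
class Toleration:
    key: str
    operator: str = "Equal"  # "Equal" or "Exists"
    value: str = ""
    effect: str = ""  # an empty effect matches every effect


@dataclass
class Node:
    name: str
    unschedulable: bool = False  # what cordoning a node sets
    taints: list = field(default_factory=list)


def tolerates(tol: Toleration, taint: Taint) -> bool:
    """Return True if the toleration matches the taint."""
    if tol.effect and tol.effect != taint.effect:
        return False
    if tol.operator == "Exists":
        return tol.key == "" or tol.key == taint.key
    return tol.key == taint.key and tol.value == taint.value


def can_schedule(node: Node, tolerations: list) -> bool:
    """Simplified check: may a new pod with these tolerations land on the node?"""
    if node.unschedulable:
        return False  # simplification: cordoned nodes take no new pods
    for taint in node.taints:
        if taint.effect == "PreferNoSchedule":
            continue  # soft preference, does not block scheduling
        if not any(tolerates(t, taint) for t in tolerations):
            return False
    return True


cordoned = Node("node-a", unschedulable=True)
tainted = Node("node-b", taints=[Taint("dedicated", "NoSchedule", "batch")])
plain = Node("node-c")
batch_tol = [Toleration("dedicated", "Equal", "batch", "NoSchedule")]

print(can_schedule(cordoned, batch_tol))  # False
print(can_schedule(tainted, []))          # False
print(can_schedule(tainted, batch_tol))   # True
print(can_schedule(plain, []))            # True
```

In both cases, pods already running on the node are left alone unless the node is drained or the taint effect is `NoExecute`, which is why the affected nodes still had to be deleted in the scenario above.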
I was also asked about Istio and given a diagram highlighting an "invalid certificate" error. Each time a proxy was restarted, the error disappeared temporarily but then showed up on another pod, and it persisted even after the certificate was renewed. The permanent fix was to restart all pods after renewing the certificate, which ensured that the error would not reappear.
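One possible way to restart all pods is a rolling restart of each workload. The sketch below is an assumption about how that could be scripted, not what the interviewers showed: the namespace `shop` and the deployment names are hypothetical, and it only prints the `kubectl rollout restart` commands unless `dry_run` is turned off.

```python
import subprocess


def restart_commands(namespace: str, deployments: list) -> list:
    """Build one `kubectl rollout restart` command per deployment."""
    return [
        ["kubectl", "rollout", "restart", f"deployment/{name}", "-n", namespace]
        for name in deployments
    ]


def restart_all(namespace: str, deployments: list, dry_run: bool = True) -> list:
    """Print the commands (dry run) or execute them against the cluster."""
    commands = restart_commands(namespace, deployments)
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # requires kubectl and cluster access
    return commands


# Hypothetical namespace and deployment names, for illustration only.
restart_all("shop", ["cart", "checkout"])
```

A rolling restart replaces pods gradually, so every sidecar proxy starts fresh after the renewal without taking the whole service down at once.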