# ASIC Design Engineer Interview Questions

ASIC design engineer interview questions shared by candidates

## Top Interview Questions

### ASIC Design Engineer at NVIDIA was asked...

Design a full adder using 2:1 muxes.

7 Answers

- A full adder can be implemented with two half adders; a half adder can be implemented with an XOR gate and an AND gate. XOR and AND gates can each be implemented with a 2:1 mux.
- A full adder can be built from 2 half adders and one OR gate; one half adder can be built from an XOR and an AND. Therefore we need only OR, AND, and XOR, and all three gates can be built from muxes. It can be implemented using 8 muxes.
- You just need 6 2-to-1 muxes. First draw the truth table and implement it using two 4-to-1 muxes, with A and B as selects and Cin/~Cin as inputs. It should be quite easy. Then break each 4-to-1 mux into three 2-to-1 muxes.
- sum = a xor b xor cin; carry = (a xor b) cin + ab. You can easily make XOR, OR, AND, and NOT using a 2:1 mux.
- So 8 muxes?!?
- For a full adder, both SUM and Cout are probably needed, so you need 7 2:1 muxes for each of them, hence 14 muxes. The simplest solution would be a LUT (look-up table) in my opinion. If you need to implement gates, then potentially more muxes are needed.
- If Mux(I1, I2, S) is a 2x1 mux module, then Sum = Mux(Mux(C, C', B), Mux(C', C, B), A), which requires 3 2x1 muxes, and Carry = Mux(Mux(0, C, B), Mux(C, 1, B), A), which requires 3 2x1 muxes.
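
The mux expressions in the last answer are easy to check exhaustively. The following is a minimal Python sketch, not a gate-level netlist: `mux2` and `full_adder_mux` are hypothetical names, and it assumes `Mux(I1, I2, S)` selects `I1` when `S` is 0. It also assumes the inverted carry-in `C'` is not available for free and builds it from one extra mux, which brings the total to seven.

```python
from itertools import product


def mux2(i0, i1, sel):
    """2:1 mux: returns i0 when sel is 0 and i1 when sel is 1."""
    return i1 if sel else i0


def full_adder_mux(a, b, cin):
    """Full adder built only from mux2 calls (7 muxes in this sketch)."""
    cin_n = mux2(1, 0, cin)  # inverter from one mux (assumption: ~Cin is not an input)
    # Sum: B xor C when A = 0, B xnor C when A = 1 (3 muxes)
    s = mux2(mux2(cin, cin_n, b), mux2(cin_n, cin, b), a)
    # Carry: B and C when A = 0, B or C when A = 1 (3 muxes)
    cout = mux2(mux2(0, cin, b), mux2(cin, 1, b), a)
    return s, cout


def check_full_adder():
    """Compare the mux network against a + b + cin for all 8 input combinations."""
    for a, b, cin in product((0, 1), repeat=3):
        s, cout = full_adder_mux(a, b, cin)
        if 2 * cout + s != a + b + cin:
            return False
    return True
```

If `~Cin` is available as an input, the inverter mux drops out and six muxes remain, which matches the "6 2-to-1 muxes" answer.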

### ASIC Design Engineer at NVIDIA was asked...

Design a combinational circuit that counts the number of 1s in a 7-bit input.

7 Answers

- Depending on resource and timing constraints, you can use a cascade of adders, where you repeatedly add each bit to the running total, starting with bit 7. This is slow because the critical path runs Cin -> Cout. To improve on it, you can use a 7-bit decoder (or any arrangement of decoder, column muxer + decoder) as a lookup, which is essentially an SRAM array that stores the result so the next time you need it you can read it directly. Essentially you are caching the computation result. This requires extra circuitry, but it means each sum only has to be computed once.
- It can be done in two ways. Clocked circuit: use a one-bit adder and a register, with the output of the register acting as the second input to the adder. A half adder can also be used. Combinational circuit: we can do it with 4 full adders. For adders A and B, use 6 of the bits as inputs. For adder C, use the 7th bit and the carries of adders A and B. Adder C gives its sum as bit 0. For adder D, use the carries from all three adders. Its sum is bit 1 and its carry is bit 2.
- @Akash, adder C should take the 7th bit and the sums of adders A and B, right? Not the carries.
- 4 full adders is a waste; with some optimization you can do it with two full adders. Of course, you then need some XOR gates, inverters, AND gates, and muxes.
- Let Verilog synthesis do the work; this is a valid combinational circuit design!

      module count_ones (
          input  wire [6:0] in,
          output reg  [2:0] sum    // 0..7 needs 3 bits
      );
          always @(*) begin
              sum = in[0] + in[1] + in[2] + in[3] + in[4] + in[5] + in[6];
          end
      endmodule

- @Akash: the question clearly states COMBINATIONAL circuit. A clocked circuit is almost always sequential.
- For this question you need two 1-bit full adders and one 2-bit full adder. Call the 7 bits b0-b6. b0-b2 are the inputs of 1-bit adder FA1 and b3-b5 the inputs of FA2. The outputs of FA1 and FA2 become the two 2-bit inputs of FA3, and b6 is its carry-in.
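
The four-full-adder answer, with the correction from the follow-up comment (adder C takes the two sums, not the carries), can be checked against every 7-bit input. This is a behavioral Python sketch rather than RTL; `full_adder` and `popcount7` are hypothetical names.

```python
def full_adder(a, b, cin):
    """Return (sum, carry) of three 1-bit inputs."""
    total = a + b + cin
    return total & 1, total >> 1


def popcount7(x):
    """Count the 1s in a 7-bit value using four full adders."""
    b = [(x >> i) & 1 for i in range(7)]
    s_a, c_a = full_adder(b[0], b[1], b[2])   # adder A: bits 0..2
    s_b, c_b = full_adder(b[3], b[4], b[5])   # adder B: bits 3..5
    bit0, c_c = full_adder(s_a, s_b, b[6])    # adder C: both sums plus the 7th bit
    bit1, bit2 = full_adder(c_a, c_b, c_c)    # adder D: the three carries
    return (bit2 << 2) | (bit1 << 1) | bit0
```

All three carries have weight 2, so adder D's sum is bit 1 and its carry is bit 2, as the original answer states.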

### ASIC Design Engineer at NVIDIA was asked...

Design a divide-by-3 counter. Bonus for 50% duty cycle.

4 Answers

- Take 3 dual-edge flip-flops and create a Johnson counter. The output will be divide-by-3 with exactly 50% duty cycle.
- Is a dual-edge flip-flop a kind of cheating?
- I think this is the fastest way to do it: http://www.onsemi.com/pub_link/Collateral/AND8001-D.PDF Of course, I would not have been able to come up with this out of the blue in the middle of an interview, though...
- Two one-hot ring counters, one per clock edge, ORed together:

      module div3 (
          input  wire clk_in,
          input  wire enable,    // 0 holds both counters in reset
          output wire clk_out
      );
          reg [2:0] counter1;    // rotates on the rising edge
          reg [2:0] counter2;    // rotates on the falling edge

          always @(posedge clk_in)
              if (enable == 1'b0) counter1 <= 3'b001;
              else                counter1 <= {counter1[1:0], counter1[2]};

          always @(negedge clk_in)
              if (enable == 1'b0) counter2 <= 3'b001;
              else                counter2 <= {counter2[1:0], counter2[2]};

          assign clk_out = counter1[0] | counter2[0];
      endmodule
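
To see why ORing the two ring counters from the Verilog answer gives a 50% duty cycle, it helps to step through half periods of `clk_in`. The sketch below is a behavioral Python model under the assumption that a rising edge is the first edge after reset is released; `rotate_left` and `divide_by_3` are hypothetical names.

```python
def rotate_left(v):
    """Rotate a 3-bit one-hot value left by one position."""
    return ((v << 1) | (v >> 2)) & 0b111


def divide_by_3(num_cycles):
    """Return clk_out sampled once per half period of clk_in."""
    counter1 = counter2 = 0b001  # reset value of both ring counters
    out = []
    for _ in range(num_cycles):
        counter1 = rotate_left(counter1)        # rising edge of clk_in
        out.append((counter1 | counter2) & 1)   # value during the high phase
        counter2 = rotate_left(counter2)        # falling edge of clk_in
        out.append((counter1 | counter2) & 1)   # value during the low phase
    return out
```

After the first few samples the output settles into three half periods high followed by three low, so one output period spans three input periods with equal high and low time.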

Which is harder to fix, a setup violation or a hold violation? And why?

4 Answers

- I answered setup, then they asked how to fix the setup violation.
- It should be the hold time violation. A setup time violation can be solved by increasing the clock period; a hold time violation, however, has to be solved by carefully inserting delay into the timing path.
- I think it depends on the stage at which the problem is detected. After tape-out, it is absolutely hold time, because you cannot easily change the logic of the chip. But if the design is still in the RTL coding stage, a hold violation may be easily fixed by adding some buffers.
- At any stage of the design, it is the setup time violation that is difficult to fix. Hold time violations can be fixed easily by adding buffers, and this is also done automatically by the software. Options to fix a setup time violation are:
  1. Adjust the placement of the gates to reduce the routing between them. This reduces the parasitic capacitance seen by the path and speeds it up. It is purely a place-and-route exercise and is the most preferred solution.
  2. Upsize the registers at the input stage of the offending path. This adds silicon area to the initial stages of the path and results in additional power consumption, but it is an easy fix.
  3. Redesign the logic between register stages. This is more easily done during the earlier stages of the design, but it is a clean solution.
  4. NAND-ize the logic path. A NAND gate is faster and a better driver than a NOR gate. This can change more gates in the path, hence more floorplan and routing changes, but it could actually result in fewer buffer stages, hence less silicon area and less power consumption.
  5. Increase the clock period seen by the offending path by adding delay to the clock of the following register stage. This affects the timing of the following register stage, the routing of the clocks, and so on. Since it affects more signals and paths, it is less preferred.
  6. Remove a register stage, in effect combining the data path with its following stage.
  7. Increase the clock period for the larger block or the entire chip.
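
The claim that a longer clock period cures a setup violation but not a hold violation follows from the usual slack equations. Here is a minimal Python sketch of those two equations; the function names and the picosecond values in the usage note are made up for illustration, and `skew` is assumed to mean capture-clock arrival minus launch-clock arrival.

```python
def setup_slack(period, clk_to_q, logic_max, t_setup, skew=0):
    """Setup slack: data must arrive t_setup before the next capture edge."""
    return period + skew - (clk_to_q + logic_max + t_setup)


def hold_slack(clk_to_q, logic_min, t_hold, skew=0):
    """Hold slack: data must stay stable t_hold after the same capture edge.
    Note that the clock period does not appear in this equation."""
    return clk_to_q + logic_min - (t_hold + skew)
```

With a 200 ps clock-to-Q delay, 1900 ps of logic, and 100 ps of setup time, `setup_slack(2000, 200, 1900, 100)` is negative and `setup_slack(2500, 200, 1900, 100)` is positive, so slowing the clock fixes it. `hold_slack(200, 50, 300)` is negative at any clock period and only turns positive when delay is added to the short path, for example `hold_slack(200, 150, 300)`.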

### ASIC Design Engineer at NVIDIA was asked...

Complete the C function (body) that uses recursion to determine if the string is a palindrome.

4 Answers

- A helper that walks two indices toward the middle (the rest of this answer is cut off in the source):

      int isPalin(char *str) {
          int l = strlen(str);
          return isPalinHelper(str, 0, l - 1);
      }

      int isPalinHelper(char *str, int i, int j) {
          if (i ...

- A version that walks a head and a tail pointer:

      #include <stdio.h>
      #include <string.h>

      int palindrome(const char *head, const char *tail) {
          int val;  /* unused; see the follow-up below */
          if (head >= tail) return 1;
          return (*tail == *head) && palindrome(head + 1, tail - 1);
      }

      int check_palindrome(const char *str) {
          size_t len = strlen(str);
          if (len == 0) return 1;  /* avoid forming a pointer before str */
          return palindrome(str, str + len - 1);
      }

      int main(int argc, char *argv[]) {
          if (argc < 2) return 1;
          if (check_palindrome(argv[1])) printf("true\n");
          else printf("false\n");
          return 0;
      }

- In my previous solution, please ignore the local variable `int val` in the palindrome function.
- A version that takes the length:

      #include <stdbool.h>

      bool palindrome(const char *str, int len) {
          if (len <= 1) return true;
          return (str[0] == str[len - 1]) && palindrome(str + 1, len - 2);
      }

Frequency divide-by-3 clock circuit.

2 Answers

- It's on the net; you can Google it.
- Suppose N = 3. If the duty cycle need not be 50%, you can use a ring counter or a Moore FSM. For a 50% duty cycle, first build a counter that counts 0 -> 2; then generate two enable signals, one active at n = 0 and the other active at n = (N+1)/2; apply the two enable signals to two T flip-flops, the first one triggered on the rising edge and the second one on the falling edge; then XOR the T flip-flop outputs.
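
The two-T-flip-flop recipe in the second answer can be stepped through half period by half period. The following is a behavioral Python sketch under one reading of that answer: the rising-edge flop sees the counter value from before the edge, and the falling-edge flop sees the value after it. `divide_by_n_tff` is a hypothetical name.

```python
def divide_by_n_tff(n, num_cycles):
    """Odd-N clock divider: XOR of a rising-edge and a falling-edge T flip-flop.
    Returns the output sampled once per half period of the input clock."""
    count = 0      # counts 0 .. n-1, advancing on each rising edge
    t1 = t2 = 0    # the two T flip-flop outputs
    out = []
    for _ in range(num_cycles):
        # Rising edge: T-FF 1 toggles if the pre-edge count is 0.
        if count == 0:
            t1 ^= 1
        count = (count + 1) % n
        out.append(t1 ^ t2)
        # Falling edge: T-FF 2 toggles if the count equals (n + 1) / 2.
        if count == (n + 1) // 2:
            t2 ^= 1
        out.append(t1 ^ t2)
    return out
```

For N = 3 the two toggles land 1.5 input periods apart, so the XOR output is high for three half periods and low for three.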

Design a state machine to detect 10110101... How many flip-flops will be used?

3 Answers

- First design the FSM and get the number of state variables. The number of FFs is the number of state variables (each state variable is held in its own FF).
- 4
- I think one FF for the state vector is enough; the other parts are logic gates that decide the next state from the input, and logic gates that derive the output from the state vector.
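
One way to arrive at a flip-flop count is to build the detector's state table and count its states. The Python sketch below assumes the pattern is exactly the eight bits shown (`10110101`), allows overlapping matches, and uses binary state encoding; `build_detector`, `detect`, and `flip_flops` are hypothetical names.

```python
def build_detector(pattern):
    """Return table[state][bit] -> next state for an overlapping detector.
    State k means the last k input bits match the first k bits of pattern."""
    m = len(pattern)
    table = []
    for state in range(m + 1):
        row = {}
        for bit in "01":
            seen = pattern[:state] + bit
            # Longest prefix of pattern that is also a suffix of what was seen
            k = min(m, len(seen))
            while k > 0 and not seen.endswith(pattern[:k]):
                k -= 1
            row[bit] = k
        table.append(row)
    return table


def detect(pattern, bits):
    """Return the indices in bits where a full match of pattern ends."""
    table = build_detector(pattern)
    state, hits = 0, []
    for i, bit in enumerate(bits):
        state = table[state][bit]
        if state == len(pattern):
            hits.append(i)
    return hits


def flip_flops(num_states):
    """Flip-flops needed for a binary-encoded state register."""
    return (num_states - 1).bit_length()
```

Under these assumptions a Moore machine with a dedicated "matched" state has 9 states and needs 4 flip-flops, which agrees with the one-word answer. A Mealy machine can fold the final state into the output logic, leaving 8 states and 3 flip-flops, and a one-hot encoding would use one flip-flop per state.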

Given an 8-bit input, use only full adders to count the number of logic 1s.

2 Answers

- First work out how many bits are needed to represent the number of 1s in an 8-bit input. The largest value is 8, so you need 4 bits. How do you compute the value? I think the best way is to look at the properties of a full adder: it has 3 inputs (A, B, and Cin) and 2 outputs (S and Cout), and each input can be hooked to one bit. So you end up with 3 full adders: FA0 computes b0 (A) + b1 (B) + b2 (Cin), FA1 computes b3 (A) + b4 (B) + b5 (Cin), and FA2 computes b6 (A) + b7 (B) + 0 (there is no 9th bit). Each FA produces a 2-bit sum: S1, S2, and S3. Next, add S1 and S2 with 2 FAs, which is straightforward, to get S12. Then add S3 to S12 using 3 FAs, because a 3-bit number plus a 2-bit number can produce a 4-bit number. We therefore use 7 FAs. Usually the question asks for the number of 1s in a 7-bit number, which reduces the number of FAs to 4: keep S1 and S2, but drop the third first-stage adder, because bit 7 can be used as a carry-in.
- Are you not using a total of 8 FAs with your approach here? 3 + 2 + 3 = 8
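
The two answers disagree on the count (7 versus 3 + 2 + 3 = 8). A carry-save arrangement, where bits of equal weight are compressed three at a time, does reach 7. The wiring in this Python sketch is an illustration of that idea, not the structure the first answer describes, and `popcount8` is a hypothetical name.

```python
def popcount8(x):
    """Count the 1s in an 8-bit value using only full adders.
    Returns (count, number_of_full_adders_used)."""
    used = 0

    def fa(a, b, cin):
        nonlocal used
        used += 1
        total = a + b + cin
        return total & 1, total >> 1   # (sum, carry)

    b = [(x >> i) & 1 for i in range(8)]
    # Stage 1: compress the eight weight-1 bits into three sums and three carries.
    s0, c0 = fa(b[0], b[1], b[2])
    s1, c1 = fa(b[3], b[4], b[5])
    s2, c2 = fa(b[6], b[7], 0)
    # Weight 1: the three sum bits produce result bit 0.
    r0, c3 = fa(s0, s1, s2)
    # Weight 2: four carries (c0..c3) take two adders and produce result bit 1.
    t, c4 = fa(c0, c1, c2)
    r1, c5 = fa(t, c3, 0)
    # Weight 4: the two remaining carries produce result bits 2 and 3.
    r2, r3 = fa(c4, c5, 0)
    return (r3 << 3) | (r2 << 2) | (r1 << 1) | r0, used
```

Three of the seven adders have a constant 0 input, so they could be half adders if the "only full adders" restriction were lifted.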

### ASIC Design Engineer at NVIDIA was asked...

Is there any benefit to using a cache if every access is a read miss?

4 Answers

- No benefit.
- You should probably think a little more about this problem before you just say there's no benefit, though in most cases I do agree with you. You haven't said anything about write misses, and even though the delay they contribute isn't as large as that of a read miss, I'd still mention them as a plus for having a cache, especially in write-back caches, where you could have a trace of nothing but writes to locations brought in by your read misses, which means you get n cache write hits. Also make sure to mention ways to improve the cache hit rate: increasing the cache size, changing the associativity of the cache, or changing the compiler to optimize for the cache type (if we're talking about an I-cache).
- If the pattern is such that there is a miss on every access, then there would be no benefit to having a cache, for both read misses and write misses. If there is a read miss for a block, then there would also be a write miss for that same block. Could you be more specific about how a cache would help in such a scenario?
- The question doesn't say anything about writes. So even if every read is a miss, the cache will help processor performance by providing write hits. It is very common to read and update the same variable, for instance a++ or any operation of that sort. So the cache is beneficial.
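
The read-then-update argument in the last answer can be illustrated with a toy model. This Python sketch assumes an unbounded, write-allocate cache with 64-byte lines and no evictions, which is far simpler than real hardware; `simulate` and the trace format are hypothetical.

```python
def simulate(trace, line_size=64):
    """trace: list of ('R' or 'W', byte_address) pairs.
    Returns hit and miss counts for reads and writes separately."""
    lines = set()  # cache lines currently held (assumption: nothing is evicted)
    stats = {"R": {"hit": 0, "miss": 0}, "W": {"hit": 0, "miss": 0}}
    for op, addr in trace:
        line = addr // line_size
        if line in lines:
            stats[op]["hit"] += 1
        else:
            stats[op]["miss"] += 1
            lines.add(line)  # allocate the line on a read or write miss
    return stats


def increment_trace(num_vars, line_size=64):
    """An a++ pattern on num_vars variables, each in its own cache line."""
    trace = []
    for i in range(num_vars):
        trace.append(("R", i * line_size))  # load the variable
        trace.append(("W", i * line_size))  # store the updated value
    return trace
```

For `simulate(increment_trace(4))`, every read misses and every write hits, which is the scenario the last answer describes. Whether those write hits amount to a real benefit depends on the write policy and on the miss penalty, which this model does not capture.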

### ASIC Design Engineer at Broadcom was asked...

How do you generate a divide-by-3 clock?

4 Answers

- Assume the input clock is a square wave with 50% duty cycle. Method 1: Clk / 3 is equal to Clk / (6/2), which means first divide by 6 and then multiply by 2. To divide by 6, use two DFFs (D tied to Q_b, with the output connected to the clock of the second DFF). To multiply by 2, create some delay and XOR the two signals (the divide-by-6 signal and its delayed copy). It is hard to give the output clock a 50% duty cycle this way because the delay must be controlled precisely, so the alternative is to multiply by 2 first and then divide by 6. Method 2: Use two edge counters (one for the rising edge and one for the falling edge) and draw a state machine that goes back to the toggle state when both counters reach 2.
- Sorry, Method 1 is incorrect. A workable Method 1 is to delay the input clock and XOR the input clock with its delayed copy (the delay does not need to be exactly half the input period, which is good), then use a single rising-edge counter that toggles when it has counted 3 rising edges. This is glitch-free.
- Please refer to: http://www.eetimes.com/document.asp?doc_id=1202359
- Sorry, the link above is wrong. Correct link: http://vlsiwizard.blogspot.com/2008/01/design-clock-divide-by-3-circuit-with.html

