

# UNIVERSIDADE FEDERAL DE SANTA CATARINA, TECHNOLOGY CENTER, UNDERGRADUATE PROGRAM IN COMPUTER SCIENCE

Bernardo Borges Sandoval

Quasar: A Radiation Robustness Evaluation Tool for Electronic Circuits

Florianópolis 2024

Undergraduate thesis (Trabalho de Conclusão de Curso) submitted to the Undergraduate Program in Computer Science of the Universidade Federal de Santa Catarina. Advisor: Prof. Cristina Meinhardt, Dr. Co-advisor: Prof. Rafael Budim Schvittz, Dr.

Catalog record generated by an automated system managed by BU/UFSC. Data entered by the author.

> Sandoval, Bernardo Borges. Quasar: A Radiation Robustness Evaluation Tool for Electronic Circuits / Bernardo Borges Sandoval; advisor, Cristina Meinhardt; co-advisor, Rafael Budim Schvittz, 2024. 62 p.

Undergraduate thesis (Trabalho de Conclusão de Curso), Universidade Federal de Santa Catarina, Centro Tecnológico, Undergraduate Program in Computer Science, Florianópolis, 2024.

Includes references.

1. Computer Science. 2. Variability Effects. 3. Single Event Transient. 4. EDA Tool. I. Meinhardt, Cristina. II. Schvittz, Rafael Budim. III. Universidade Federal de Santa Catarina. Undergraduate Program in Computer Science. IV. Title.

Bernardo Borges Sandoval

## Quasar: A Radiation Robustness Evaluation Tool for Electronic Circuits

This undergraduate thesis was evaluated and approved by an examining committee composed of the following members:

> Prof. Cristina Meinhardt, Dr. Universidade Federal de Santa Catarina

Prof. Rafael Budim Schvittz, Dr. Universidade Federal do Rio Grande

Prof. José Luís Almada Güntzel, Dr. Universidade Federal de Santa Catarina

Ingrid Fortes Vasconcelos Oliveira, Dr. Embraer

We certify that this is the **original and final version** of the thesis, which was judged adequate for obtaining the degree of Bachelor in Computer Science.

Coordination of the Undergraduate Program in Computer Science

Prof. Cristina Meinhardt, Dr. Advisor

Florianópolis, 2024.

This work is dedicated to everyone who believed in me.

### ACKNOWLEDGEMENTS

First, I would like to thank everyone who supported me in getting here. In particular, my mother, who always encouraged me to do my best; Paula Borges Monteiro, my first advisor, who started my career in science and encouraged me to continue in research; and Cristina Meinhardt, my current advisor, who accompanied me throughout my undergraduate studies and gave me opportunities to go much further than I imagined possible. Without them, I certainly would not be where I am today.

I thank all the professors who supported me along the way, especially my co-advisor Rafael Budim Schvittz, who followed my research from start to finish, and Fernando Santana Pacheco, who introduced me to programming and taught me to enjoy it. I am grateful to everyone who had the patience to answer my questions, sparked my interest in the most diverse topics, and served as an example to follow. Having had the education I had was a true privilege.

Finally, I thank all my colleagues: those who, even while following different paths, continued to support me, believe in me, and follow my growth from early on; those who walked much of the path with me and on whom I could lean when I needed to; and the many I met along the way, especially those I had the pleasure of working alongside in our laboratory.

"Time flies when you're having fun" (A friend of André's)

#### RESUMO

À medida que o design de circuitos avança e o número de transistores em um chip atinge bilhões, a demanda por ferramentas para auxiliar o design desses circuitos também aumenta. Embora a redução do tamanho dos transistores traga muitos benefícios, ela também torna a tecnologia mais sensível a alguns tipos de falhas e aumenta o impacto de pequenas variações no processo de fabricação. Falhas de radiação têm sido uma preocupação crescente nas últimas décadas, tornando-se um problema não apenas para aplicações aeroespaciais, mas também em nível terrestre. Nesse contexto, este trabalho apresenta o Quasar, uma ferramenta de código aberto desenvolvida para aprimorar a avaliação dos efeitos da variabilidade na sensibilidade à radiação em nível elétrico. Quasar recebe como entrada uma *netlist* descrevendo o circuito e determina automaticamente métricas de robustez, como o *Linear Energy Transfer* crítico, para cada configuração em que uma falha do tipo Single Event Transient pode propagar um erro. A ferramenta pode lidar com desde portas lógicas únicas até circuitos de tamanho médio com múltiplas portas em poucos segundos, acelerando o mecanismo tradicional de injeção de falhas baseado em um grande número de simulações elétricas. A ferramenta não é acoplada a um único simulador elétrico, modelo de transistor ou parâmetro de variabilidade, permitindo uma alta versatilidade na análise de circuitos. O fluxo de trabalho da ferramenta explora o mascaramento lógico para reduzir a exploração do espaço de design, ou seja, para diminuir o número necessário de simulações elétricas. O paralelismo também é usado para acelerar a avaliação em nível de circuito. O Quasar já demonstrou potencial para fornecer resultados úteis. Neste trabalho, três aplicações do Quasar são apresentadas e discutidas. 
A primeira é uma avaliação de mapeamento de portas para uma função lógica, mostrando que a sensibilidade à radiação de um circuito pode ser aproximada pela robustez de sua porta de saída mais sensível. A análise também mostra como a variabilidade pode influenciar significativamente a confiabilidade, especialmente o fato de que a hierarquia de sensibilidade entre as portas NOR2 e NAND2 é altamente dependente da flutuação da Função Trabalho em dispositivos *FinFET*. Na segunda parte, um estudo de caso de rede restauradora explora a razão por trás das diferentes respostas dos circuitos à variabilidade, além de como estimar de maneira analítica configurações críticas do circuito. Na terceira parte, os resultados fornecidos pelo Quasar são comparados com os de uma ferramenta similar. Enfim, mostramos que os objetivos deste trabalho foram contemplados: a ferramenta desenvolvida é capaz de avaliar eficientemente a robustez à radiação de um circuito levando em conta a variabilidade de processo.

Palavras-chave: Efeitos de Variabilidade, Single Event Transient, ferramenta de EDA.

## ABSTRACT

As circuit design advances and the number of transistors on a chip reaches billions, the demand for tools that support the design of these circuits grows. Although reducing the size of transistors brings many benefits, it also makes the technology more sensitive to some types of faults and increases the impact of small variations in the manufacturing process. Radiation faults have been a growing concern in the past decades, becoming a problem not only for aerospace applications but also at ground level. In this light, this work presents Quasar, an open-source tool developed to improve the evaluation of variability effects on radiation sensitivity at the electrical level. Quasar receives as input a netlist describing the circuit and automatically determines robustness metrics, such as the critical Linear Energy Transfer, for every configuration in which a Single Event Transient fault can propagate an error. The tool can handle anything from small basic cells to medium-sized multi-gate circuits in a few seconds, speeding up the traditional fault injection mechanism based on a large number of electrical simulations. It is not coupled to a single electrical simulator, transistor model, or variability parameter, allowing for high versatility in circuit analysis. The tool's workflow exploits logical masking to reduce the design space exploration, i.e., to reduce the necessary number of electrical simulations. Parallelism is also used to speed up the circuit-level evaluation. Quasar has already shown the potential to provide useful results. In this work, three applications of Quasar are presented and discussed. The first is a gate mapping evaluation showing that a circuit's radiation sensitivity can be approximated by the robustness of its most sensitive output gate. It also shows how process variability can significantly influence reliability, especially the fact that the sensitivity hierarchy between the NOR2 and NAND2 gates is highly dependent on the Work Function Fluctuation in metal-gate devices. In the second, a restoring network case study explores the reasoning behind the different responses of the evaluated circuits to variability, as well as how to estimate critical circuit configurations analytically. In the third, the results provided by Quasar are compared with those of a similar tool. In summary, the objectives of this work were achieved: the developed tool is capable of efficiently evaluating the radiation robustness of a circuit while considering process variability.

Keywords: Variability Effects, Single Event Transient, EDA tool.

# LIST OF FIGURES

- Figure 1 – Silicon Lattices
- Figure 2 – PN Junction
- Figure 3 – N and P type MOSFET Cross-Section, Respectively
- Figure 4 – N and P type MOSFET with charged gate
- Figure 5 – Double gate FinFET transistor
- Figure 6 – Metal Gate Granularity Representation
- Figure 7 – Single Event Transient (SET) Fault Generation Phases
- Figure 8 – SET Generated Current
- Figure 9 – Fault Model
- Figure 10 – Logical Masking
- Figure 11 – Electrical Masking
- Figure 12 – Temporal Masking
- Figure 13 – No Masking
- Figure 14 – Quasar Layer Model
- Figure 15 – Critical LET Search Layer
- Figure 16 – Output Voltage Response to Current Pulse Increase
- Figure 17 – Output Voltage Response to Current Pulse Increase Transition Zone
- Figure 18 – NAND2 Gate Transistor Topology and Graph Model
- Figure 19 – Variability Evaluation Layer
- Figure 20 – Mirror Full Adder Graph Model
- Figure 21 – Mirror Full Adder Electrical Path Groups Graph
- Figure 22 – Parallelization Framework
- Figure 23 – Variability Evaluation Layer
- Figure 24 – C17 Topologies
- Figure 25 – NAND2 and NOR2 $LET_{th}$ Dispersion
- Figure 26 – The difference between NAND2 and NOR2 $LET_{th}$ Dispersion
- Figure 27 – C17_Mixed Critical Node
- Figure 28 – NOR2 and Inverter Gates
- Figure 29 – Average $LET_{th}$ of the Analyzed Gates

# LIST OF TABLES

- Table 2 – 32nm PTM Main Parameters
- Table 3 – 7nm FinFET ASAP7 Main Parameters
- Table 4 – Related Work
- Table 5 – Number of Simulations to Determine Critical LET
- Table 6 – Circuit Level Evaluation Execution Time
- Table 7 – $LET_{th}$ distribution in MeV.cm<sup>2</sup>/mg
- Table 8 – Critical Charge (fC) for NAND3 in different technologies and methodologies

# LIST OF ABBREVIATIONS AND ACRONYMS

| Abbreviation | Meaning                                                 |
|--------------|---------------------------------------------------------|
| AI           | Artificial Intelligence                                 |
| CMOS         | Complementary Metal-Oxide-Semiconductor                 |
| EDA          | Electronic Design Automation                            |
| EPG          | Electrical Path Group                                   |
| FinFET       | Fin Field-Effect Transistor                             |
| FPGA         | Field-Programmable Gate Array                           |
| IC           | Integrated Circuit                                      |
| IHP          | Leibniz Institute for High Performance Microelectronics |
| LET          | Linear Energy Transfer                                  |
| LUT          | Lookup Table                                            |
| MC           | Monte Carlo                                             |
| MDG          | Multiway Decision Graphs                                |
| MGG          | Metal Gate Granularity                                  |
| ML           | Machine Learning                                        |
| MOSFET       | Metal–Oxide–Semiconductor Field-Effect Transistor       |
| PDK          | Process Design Kit                                      |
| PTM          | Predictive Technology Model                             |
| RDD          | Random Discrete Doping                                  |
| RTL          | Register-Transfer Level                                 |
| SEAT         | Soft Error Analysis Toolset                             |
| SEE          | Single Event Effects                                    |
| SER          | Soft Error Rate                                         |
| SET          | Single Event Transient                                  |
| TCAD         | Technology Computer Aided Design                        |
| WF           | Work Function                                           |

# CONTENTS

| Section | Title                                        |
|---------|----------------------------------------------|
| 1       | INTRODUCTION                                 |
| 1.1     | OBJECTIVES                                   |
| 1.2     | TEXT ORGANIZATION                            |
| 2       | BACKGROUND                                   |
| 2.1     | MOSFET DEVICES                               |
| 2.2     | FINFET DEVICES                               |
| 2.3     | PROCESS VARIABILITY                          |
| 2.3.1   | Random Discrete Doping (RDD)                 |
| 2.3.2   | Metal Gate Granularity (MGG)                 |
| 2.4     | RADIATION FAULT GENERATION AND PROPAGATION   |
| 2.5     | SET SIMULATION                               |
| 2.6     | SET MASKING EFFECTS                          |
| 2.6.1   | Logical Masking                              |
| 2.6.2   | Electrical Masking                           |
| 2.6.3   | Temporal Masking                             |
| 3       | RELATED WORK                                 |
| 3.1     | SOFT ERROR ANALYSIS TOOLSET (SEAT)           |
| 3.2     | IHP METHODOLOGY                              |
| 3.3     | POLYTECHNIQUE MONTREAL RTL METHODOLOGY       |
| 3.4     | AGUIAR'S TOOL                                |
| 3.5     | OVERVIEW                                     |
| 4       | TOOL PROPOSAL AND DEVELOPMENT                |
| 4.1     | LAYER 0 - SET MODELING                       |
| 4.2     | LAYER 1 - CRITICAL LET SEARCH                |
| 4.2.1   | Electrical Validation                        |
| 4.2.2   | Critical Linear Energy Transfer (LET) Search |
| 4.3     | LAYER 2 - CIRCUIT LEVEL EVALUATION           |
| 4.3.1   | Logical Validation                           |
| 4.3.2   | Parallelization Framework                    |
| 4.4     | LAYER 3 - VARIABILITY EVALUATION             |
| 5       | RESULTS                                      |
| 5.1     | GATE MAPPING CASE STUDY                      |
| 5.2     | RESTORING NETWORK CASE STUDY                 |
| 5.3     | IHP METHODOLOGY COMPARISON                   |
| 6       | CONCLUSION                                   |
| 6.1     | PUBLICATIONS                                 |
| 6.2     | SOURCE CODE                                  |
|         | REFERENCES                                   |
| ANNEX A | PAPER PUBLISHED IN THE PROCEEDINGS OF THE 2024 IEEE 31ST INTERNATIONAL CONFERENCE ON ELECTRONICS CIRCUITS AND SYSTEMS (ICECS) |

## 1 INTRODUCTION

Circuit design techniques have changed considerably over the last decades. Due to technology trends following Moore's Law, the number of transistors in a chip has grown exponentially (KAHNG, 2010). Designing a circuit is a very complex process that involves not only choosing how devices are connected and physically placed, but also timing, reliability, robustness, and power efficiency. It would be unfeasible to make these choices individually and manually for the billions of transistors in modern chips. For that purpose, Electronic Design Automation (EDA) tools were developed to automate parts of Integrated Circuit (IC) design and help these processes converge, reconciling design requirements with technology challenges.

Advancements in microelectronics bring many benefits, as processing speed increases while transistor size decreases. Technologies such as Artificial Intelligence (AI), which require high processing power and large databases, become viable. These advances, however, come with certain caveats. Higher clock speeds allow faster processing, and lower supply voltages decrease power consumption per device, but both effects also make circuits more sensitive to faults such as those induced by radiation (BAUMANN, 2005).

Radiation faults were first observed during the 1980s (FERLET-CAVROIS; MASSENGILL; GOUKER, 2013). At that time, these concerns were primarily relevant for aerospace applications, as systems operating at high altitudes or beyond the atmosphere lack the protection of the ozone layer and Earth's magnetic field. Particles reaching ground level did not have enough energy to pose a significant concern for circuit reliability. However, since the late 1990s, technology evolution has allowed the design of circuits whose speeds, supply voltages, and parasitic capacitances make the energy necessary to cause a fault much lower than in previous decades. Consequently, energetic particles that reach ground level can cause faults more frequently.

Furthermore, the decreased size of transistors introduces another reliability issue. The manufacturing process involves small sources of variability, which can significantly impact the behavior of devices operating at the nanometer scale, especially as the technology approaches atomic dimensions (KAHNG, 2010). Therefore, for a range of applications, it is important that EDA tools provide a way to evaluate circuit robustness to radiation faults during the design flow. It is also important that variability be taken into account in this reliability evaluation.

Several works have explored tools and methodologies to evaluate the radiation robustness of circuits at different levels of abstraction (ANDJELKOVIC; KRSTIC, 2024b; HAMAD; MOHAMED; SAVARIA, 2016; AZIMI et al., 2018; V; MITTAL; KUMAR, 2023; PENG et al., 2019; AGUIAR et al., 2016; RAJARAMAN et al., 2006; LI; DRAPER, 2016; HAMAD; HASAN, et al., 2014a). Some approaches use Technology Computer Aided Design (TCAD) simulations to simulate a transistor's response to a particle collision more precisely, integrating this response into electrical simulations to model radiation effects more accurately (ANDJELKOVIC; KRSTIC, 2024b; V; MITTAL; KUMAR, 2023; PENG et al., 2019). This gives greater precision in the radiation response but is very time consuming, requiring auxiliary speedup techniques to achieve a reasonable execution time.

Other works seek to model radiation effects at the Register-Transfer Level (RTL) (HAMAD; MOHAMED; SAVARIA, 2016; AZIMI et al., 2018). This allows larger circuits to be analyzed but compromises the precision with which the generation and propagation of radiation faults are evaluated, as most of the circuit's electrical response is abstracted away.

Finally, some works simulate radiation at the electrical level (AGUIAR et al., 2016; ANDJELKOVIC; KRSTIC, 2024b; LI; DRAPER, 2016). This is the abstraction level chosen for this work, as it offers a desirable compromise between precision and execution time. Although approaches similar to this work have been explored previously (AGUIAR et al., 2016), most fail to consider variability in the robustness analysis.

### 1.1 OBJECTIVES

The main objective of this work is to propose Quasar, a tool for evaluating the radiation robustness of multi-gate circuits at the electrical level while considering process variability. The tool exploits logical modeling and parallelization to significantly speed up simulations.

The specific goals of this work are to:

- Propose and implement an open-source radiation evaluation EDA tool;
- Boost the tool's performance with parallelization and prediction techniques;
- Enable the tool to evaluate different circuit technologies and to be integrated with multiple electrical simulators;
- Provide a mechanism to easily evaluate the effects of variability on radiation sensitivity, considering different circuits and technologies;
- Allow the analysis of various mitigation approaches at the electrical level.

Moreover, this work aims to provide useful results that demonstrate the tool's usability.

### 1.2 TEXT ORGANIZATION

The remainder of this document is organized as follows. Chapter 2 presents the fundamental concepts behind transistor devices and radiation faults needed to understand the rest of this work. Chapter 3 presents an overview of similar works that study radiation faults in electrical circuits, including an in-depth comparison between some of these tools and Quasar. At the end of that chapter, a table gives a high-level comparison between all works and Quasar, highlighting their differences. Chapter 4 describes the implemented tool in detail, as well as the main techniques used to increase performance. Chapter 5 analyzes the impact of circuit topology on radiation robustness, with all results obtained using Quasar. Finally, Chapter 6 wraps up the main topics of this work.

## 2 BACKGROUND

This chapter presents an overview of how Metal–Oxide–Semiconductor Field-Effect Transistors (MOSFETs) work and the properties relevant to the study of radiation. It also explains the main physical mechanism behind radiation faults, as well as how to model and simulate them.

### 2.1 MOSFET DEVICES

Transistor devices are fabricated from a silicon crystal body (WESTE; HARRIS, 2015). Silicon has four valence electrons and needs four more to achieve stability. In a crystal, this is obtained by each silicon atom forming covalent bonds with four other silicon atoms, creating a lattice as illustrated in Figure 1a. These lattices can be doped with other elements. For instance, arsenic has five valence electrons but still bonds with four silicon atoms, maintaining the lattice structure. The fifth electron does not become part of any bond and is free in the lattice, leaving the arsenic atom with a positive charge. This forms an N-type lattice, illustrated in Figure 1b. Similarly, when boron or another element with three valence electrons is used, it bonds with four silicon atoms but leaves behind a hole, the missing electron. The boron atom becomes negatively charged and the hole becomes "free" in the silicon body. This forms a P-type lattice, illustrated in Figure 1c.



When a P-type silicon body is in contact with an N-type body, the free electrons present in the N body diffuse toward the P body. These electrons fill the holes present in the P body, nullifying both charge carriers and creating the PN junction illustrated in Figure 2 along with the depletion zone, a region with no free electrons or holes. This flow of electrons is known as the diffusion current, and it eventually stops even though the concentration of electrons across the two bodies is not fully balanced. Despite having free electrons or holes, both the P-type and N-type bodies are initially electrically neutral, as they have the same number of protons and electrons. This stops being true in the depletion zone because of the diffusion current: the P body becomes electrically negative as electrons from the N body fill its holes, and the inverse is true for the N body. This charge imbalance creates an electrical field that opposes the diffusion current (RABAEY, 2003). The presence of this electrical field will be crucial to understanding how radiation particles impact circuit behavior.
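The strength of this junction field is commonly summarized by the built-in potential. The expression below is the standard textbook approximation for an abrupt junction, not a formula taken from this work. The symbols are assumptions introduced here: $N_A$ and $N_D$ are the acceptor and donor doping concentrations, $n_i$ is the intrinsic carrier concentration of silicon, $k$ is the Boltzmann constant, $T$ is the temperature, and $q$ is the elementary charge.

$$V_{bi} = \frac{kT}{q}\,\ln\!\left(\frac{N_A N_D}{n_i^{2}}\right)$$

Under this approximation, heavier doping on either side yields a larger potential barrier across the depletion zone.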

Figure 2 – PN Junction.

Source: Adapted from (RABAEY, 2003)

The PN junction is the basis of all semiconductor devices. Current passes through the junction only in the direction from the P region to the N region: in that case, the electrical field originating from the voltage source opposes the one present in the junction and, if strong enough, allows current to pass. If the applied field were in the same direction as the junction field, the depletion zone would only widen and no current would pass.
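The one-way conduction described above can be illustrated with the ideal (Shockley) diode equation. This is a minimal sketch under stated assumptions: the saturation current `i_s`, the ideality factor `n`, and the 0.7 V bias are illustrative values chosen here, not parameters from this work.

```python
import math


def diode_current(v, i_s=1e-14, n=1.0, temp_k=300.0):
    """Ideal (Shockley) diode current in amperes for a bias v in volts.

    i_s (saturation current) and n (ideality factor) are assumed,
    illustrative values.
    """
    k_b = 1.380649e-23    # Boltzmann constant, J/K
    q = 1.602176634e-19   # elementary charge, C
    v_t = k_b * temp_k / q  # thermal voltage, roughly 25.9 mV at 300 K
    return i_s * (math.exp(v / (n * v_t)) - 1.0)


# Forward bias: the P side is at the higher potential, so the applied
# field opposes the junction field and current rises exponentially.
forward = diode_current(0.7)

# Reverse bias: the applied field reinforces the junction field, so only
# a tiny leakage current (bounded by i_s) flows.
reverse = diode_current(-0.7)

print(f"forward: {forward:.3e} A, reverse: {reverse:.3e} A")
```

With these assumed values, the forward current is many orders of magnitude larger than the reverse leakage, which is the asymmetry the paragraph describes.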

There are two types of MOSFETs regarding their electrical behavior. Figure 3 shows the P-type (PMOS) and N-type (NMOS) MOSFETs. Both devices have a body of one lattice type and two wells of the other type, connected to the source and drain terminals. In this configuration, current cannot pass from source to drain through the transistor, because the electrical field created by the voltage difference cannot oppose both junction fields, which point in opposite directions.

Current flow is obtained when the gate is electrically charged with the same polarity as the free carriers present in the body lattice: negative for an N-type body and positive for a P-type body. This charge repels these carriers, essentially expanding the depletion zone and allowing the carriers from the wells to fill it, so that source, channel, and drain form a single zone, as represented in Figure 4. Current can then flow from source to drain, as there is no PN junction, and therefore no opposing electrical field, in the current's path.

The Complementary metal-oxide-semiconductor (CMOS) architecture uses complementary networks of PMOS and NMOS transistors to express logic. PMOS transistors are better at conducting high logic levels, so they compose the pull-up network. Likewise, NMOS transistors conduct low logic levels better and compose the pull-down network.
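The complementary behavior of the two networks can be sketched at the switch level. The example below models a standard NAND2 gate (two PMOS in parallel, two NMOS in series) as boolean conduction functions. It is an illustrative abstraction with hypothetical function names, not the graph model used by Quasar.

```python
from itertools import product


def nand2_networks(a, b):
    """Return (pull_up_conducts, pull_down_conducts) for a NAND2 gate.

    Switch-level abstraction: a PMOS conducts when its gate input is low,
    an NMOS conducts when its gate input is high.
    """
    a, b = bool(a), bool(b)
    pull_up = (not a) or (not b)  # two PMOS in parallel, tied to VDD
    pull_down = a and b           # two NMOS in series, tied to ground
    return pull_up, pull_down


def nand2_output(a, b):
    """Logic level at the gate output for inputs a and b."""
    pull_up, pull_down = nand2_networks(a, b)
    # In static CMOS exactly one network conducts for every input vector,
    # so the output is never floating and never shorted.
    assert pull_up != pull_down
    return 1 if pull_up else 0


for a, b in product((0, 1), repeat=2):
    print(a, b, "->", nand2_output(a, b))
```

The internal assertion makes the design property explicit: for every input combination, the pull-up and pull-down networks are never conducting at the same time.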

With technology scaling, the dimensions of the gate and diffusion areas are shrinking, reaching nanometer scales. This brings new challenges due to stronger short-channel effects, such as increased leakage currents.

To allow realistic electrical simulations, the industry provides electrical models of the devices in the Process Design Kit (PDK). Predictive models are an initiative to sustain technology research, giving the research community independence from industry data and a way around limited access to it.



Figure 3 – N and P type MOSFET Cross-Section, Respectively.

Source: (WESTE; HARRIS, 2015)

Figure 4 – N and P type MOSFET with charged gate.



Source: Adapted from (WESTE; HARRIS, 2015)

Predictive models are generally built with data from industry and academia to provide accurate models for new device nodes or new technologies.

In the experiments presented later in this work, the 32 nm Predictive Technology Model (PTM) for planar MOSFETs (ZHAO; CAO, 2006) is used. Table 2 shows its relevant parameters.

| Parameter                              | 32nm                                |
|----------------------------------------|-------------------------------------|
| Supply Voltage                         | 0.9 V                               |
|                                        | 1000 nm                             |
| Min Width ($W_{min}$)                  | 32 nm                               |
| Oxide Thickness ($T_{ox}$)             | 3 nm                                |
| Channel Doping                         | $1.7 \times 10^{23}\ \text{m}^{-3}$ |
| Source/Drain Doping                    | $1 \times 10^{26}\ \text{m}^{-3}$   |
| Threshold Voltage ($V_{th-sat}$), NMOS | 0.49 V                              |
| Threshold Voltage ($V_{th-sat}$), PMOS | -0.49 V                             |

Table 2 – 32nm PTM Main Parameters.

Source: (ZHAO; CAO, 2006)

### 2.2 FINFET DEVICES

Scaling down transistor devices presents geometric challenges. Traditional planar MOSFET devices face electrical limitations when scaled below the 10 nm mark, where short-channel effects, mobility degradation, and leakage currents become major obstacles (YU et al., 2002). Multi-gate devices such as the Fin Field-Effect Transistor (FinFET) perform better below 10 nm than their planar counterparts and are relatively easy to fabricate. Figure 5 presents a diagram of a FinFET device. Its main feature is its 3D structure, in which the source/drain channel is elevated, creating a fin-like structure over which the gate passes. In contrast with the regular MOSFET device, FinFET sizing is discrete: a device is sized by the number of fins used.

In this work, we adopt the 7nm ASAP7 Process Design Kit (PDK) (CLARK et al., 2016) as the FinFET model. Table 3 shows the main parameters of this target technology.

Figure 5 – Double gate FinFET transistor



Source: Irene Ringworm, CC BY-SA 3.0, via Wikimedia Commons.

| Parameter                          | 7nm                               |
|------------------------------------|-----------------------------------|
| Supply Voltage                     | 0.7 V                             |
| Gate Length ($L_g$)                | 21 nm                             |
| Fin Width ($W_{fin}$)              | 6.5 nm                            |
| Fin Height ($H_{fin}$)             | 32 nm                             |
| Oxide Thickness ($T_{ox}$)         | 2.1 nm                            |
| Channel Doping                     | $1 \times 10^{22}\ \text{m}^{-3}$ |
| Source/Drain Doping                | $2 \times 10^{26}\ \text{m}^{-3}$ |
| Work Function (WF), NFET           | 4.3720 eV                         |
| Work Function (WF), PFET           | 4.8108 eV                         |
| Threshold Voltage ($V_{th}$), NFET | 0.17 V                            |
| Threshold Voltage ($V_{th}$), PFET | -0.16 V                           |

Table 3 – 7nm FinFET ASAP7 Main Parameters.

Source: (CLARK et al., 2016)

#### 2.3 PROCESS VARIABILITY

Transistor scaling poses challenges for radiation-reliable systems. One of the main consequences of scaling, which affects more than radiation robustness, is process variability (KAHNG, 2010). Lithography, the fabrication process of microelectronic chips, faces various limitations as devices scale down. During MOSFET fabrication, light is used to pattern the material precisely, and for a while the smallest feature of the fabricated device was larger than the wavelength of the light used. This is no longer the case: nowadays the smallest feature of a device is significantly smaller than the wavelength of the light used (ORSHANSKY; NASSIF; BONING, 2007). Sub-wavelength lithography introduces greater variability in modern circuits due to the inherent imprecision of fabricating a feature smaller than the wavelength of the light used to fabricate it.

Sub-wavelength lithography subjects transistors to significant process variability. This impacts device behavior, such as the on and off currents and the threshold voltage ($V_{th}$), the main parameter this work is concerned with. This is highly relevant to radiation reliability, as a lower $V_{th}$ allows a fault to propagate further through the circuit.

#### 2.3.1 Random Discrete Doping (RDD)

The PN junction and its electrical properties are among the most essential elements, if not the most essential, in the operation of semiconductor devices. Creating the junction relies on doping the silicon lattice with other elements, changing the body type to P or N. As transistors scaled down to nanometric dimensions, the number of atoms in a transistor channel reached about 100 or fewer (SAHA, 2010), and it keeps decreasing as transistors continue to scale down.

The number of dopant atoms is so small that they can be counted as discrete, and since it is not possible to choose the exact number of dopant atoms inserted, the process becomes subject to variability. Random Discrete Doping (RDD) has a sizable impact on modern transistors, as the absence or inclusion of a single atom accounts for a significant part of the channel doping. In planar CMOS devices it is one of the main factors determining the $V_{th}$ of the transistor.

#### 2.3.2 Metal Gate Granularity (MGG)

The silicon lattice that composes semiconductors is not always perfect. This crystalline structure can vary, with some instances of a device having a higher granularity than others. This factor is especially important at the silicon-to-metal interface on the gate of transistor devices, where it constitutes the Metal Gate Granularity (MGG). Figure 6 presents a diagram of the interface and how the MGG of a real device differs from an ideal one. Its variation impacts the Work Function (WF) of the device, so that more or less energy is required to free an electron from the material and allow conduction, which strongly affects the $V_{th}$ of a device (MEINHARDT; ZIMPECK; REIS, 2014).

In FinFETs and other multi-gate devices, MGG, and thus WF fluctuation, is much more impactful than RDD, as these devices have lightly doped channels (WANG et al., 2011; MUSTAFA; BHAT; BEIGH, 2013).



Figure 6 – Metal Gate Granularity Representation.

Source: (MEINHARDT; ZIMPECK; REIS, 2014)

#### 2.4 RADIATION FAULT GENERATION AND PROPAGATION

Radiation particles are abundant in our universe. Different types of particles have different effects when interacting with semiconductors due to differences in charge, mass, speed and angle of impact. Single Event Effects (SEE) are faults caused by cosmic particles, which are present even at low Earth orbits (DODD; MASSENGILL, 2003).

A SET is a type of SEE that happens when a charged particle passes through the PN junction of a transistor device. It is transient in the sense that there is no permanent physical effect on the affected circuit. The charge deposited by the particle in the transistor body generates a current pulse that might propagate through the circuit and generate an error (FERLET-CAVROIS; MASSENGILL; GOUKER, 2013). This section discusses error taxonomy, the physical phenomena that cause a SET, how it can be electrically simulated, and how it can generate an error and alter circuit behavior.

As the study of SETs is a study of robustness and system reliability, it is important to define the terminology used regarding error analysis (AVIZIENIS, 1982). A **fault** is an undesirable physical condition or disturbance in the system. In our case, it is the strike of the charged particle. An **error** is a manifestation of the fault, an alteration of the system behavior that might create an inconsistency in the data generated. When analyzing SETs, this is represented by the propagation of the fault to a memory element, which alters the data stored. Finally, a **failure** is a total deviation from the system specification, which might happen due to an error. This work does not discuss failures.

When a charged particle passes through the reverse-biased PN junction of a transistor device, a SET fault is generated. The physical phenomena describing the fault can be divided into three main phases (BAUMANN, 2005), as presented in Figure 7. First, the passage of the charged particle ionizes the electrons in covalent bonds of the silicon crystal, as presented in Figure 7a. This creates new electron-hole pairs, charge carriers that would usually recombine without any major effect. However, due to the electric field present at the PN junction, the carriers are quickly captured before recombining, generating a drift current as illustrated in Figure 7b. Finally, an ion diffusion happens because of the carrier imbalance created at the junction by the drift current, as presented in Figure 7c. It promotes the balancing of the charges and restores the device to normal behavior.





The generated current is proportional to the charge collected by the transistor during the particle's passage. Figure 8 shows the shape of the generated current. Two distinct features can be observed: a high and ephemeral peak and a long plateau that, although not high in amplitude, persists for a relatively long time.

It is important to differentiate the two kinds of current pulses modeled. A SET fault can happen in both PMOS and NMOS devices and behaves differently in each. When the particle strikes the sensitive region of a PMOS while the node is in the low logical state, it generates a low-to-high transition, characterizing a p-hit (DUAN; WANG; LAI, 2011). Likewise, if the particle strikes an NMOS while the node is in the high state, it generates a high-to-low transition, an n-hit.



Figure 8 – SET Generated Current.

#### 2.5 SET SIMULATION

There are several ways to simulate a SET (ANDJELKOVIC, M. et al., 2017). This work adopts electrical-level simulation. Although not as precise as TCAD simulation, it can provide good results for circuits up to medium size, with better precision and accuracy than RTL simulation.

The most accessible and straightforward approach to simulating a SET at the electrical level is Messenger's model (MESSENGER, 1982), a macro-model based on a single voltage-independent current source (ANDJELKOVIC, M. et al., 2017). In this approach, a double exponential current source is attached to a node to represent the current pulse generated by a fault in the devices connected to that node. This fault model is represented in Figure 9, which illustrates the circuit under evaluation and the insertion of the fault source at its internal nodes.





Source: The author, 2024.

Equation 1 shows the relation between the injected current ($I(t)$) and the collected charge ($Q_{coll}$), and between $Q_{coll}$ and the LET. The collected charge timing constant ($\tau_{\alpha}$) and the timing constant to establish the ion track ($\tau_{\beta}$) are configurable and default to 164 ps and 50 ps, respectively, according to (CARRENO; CHOI; IYER, 1990). The charge collection depth ($L$) depends on technology parameters. Since all of these parameters are fixed in a given simulation, both the LET and the critical charge depend only on the fault's current ($I(t)$).

$$Q_{coll} = I(t)\,\frac{\tau_{\alpha} - \tau_{\beta}}{e^{-\frac{t}{\tau_{\alpha}}} - e^{-\frac{t}{\tau_{\beta}}}}$$

$$LET = \frac{Q_{coll}}{10.8L} \tag{1}$$
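The relations in Equation 1 can be sketched in a few lines of Python. This is a minimal illustration, not Quasar's actual code: the function names are hypothetical, and the units in `let_from_charge` (charge in fC, depth in µm, LET in MeV·cm²/mg) are an assumption, since the text does not state them.

```python
import math

TAU_ALPHA = 164e-12  # collected charge timing constant (s), default quoted in the text
TAU_BETA = 50e-12    # ion track establishment timing constant (s), default quoted in the text


def set_current(t, q_coll, tau_a=TAU_ALPHA, tau_b=TAU_BETA):
    """Double exponential current (A) at time t (s) for a collected charge q_coll (C)."""
    if t < 0.0:
        return 0.0
    return q_coll / (tau_a - tau_b) * (math.exp(-t / tau_a) - math.exp(-t / tau_b))


def collected_charge(t, current, tau_a=TAU_ALPHA, tau_b=TAU_BETA):
    """Equation 1 as written: recover Q_coll from the current observed at a time t > 0."""
    return current * (tau_a - tau_b) / (math.exp(-t / tau_a) - math.exp(-t / tau_b))


def let_from_charge(q_coll_fc, depth_um):
    """LET = Q_coll / (10.8 L). Assumed units: fC, um and MeV*cm^2/mg."""
    return q_coll_fc / (10.8 * depth_um)
```

Integrating `set_current` over time gives back `q_coll`, which is why the pulse amplitude and the collected charge can be used interchangeably once the time constants are fixed.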

#### 2.6 SET MASKING EFFECTS

For a SET fault to actually cause an error, the fault must propagate to an output interface of the circuit and have enough amplitude to invert the logical value of said output. This has to occur during the latch window of a memory element connected to this output to cause a bit flip and alter circuit behavior. Once the fault occurs, there are three ways in which it can be masked, i.e., not be latched by a memory element: logical masking, electrical masking and temporal masking (SHIVAKUMAR et al., 2002).

#### 2.6.1 Logical Masking

If the collected charge of a SET fault is large enough, the logical value on the struck node essentially flips. Logical masking occurs when this logical flip does not impact the output due to the logic function of the circuit. Figure 10 illustrates this in a NAND2 gate. One of its inputs is at the low logical level, which means that any change in the other input, where the fault occurred, will not impact the output value. Section 4.3.1 explores logical masking to reduce the total number of simulations.
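The NAND2 case can be checked with a few lines of boolean logic. The sketch below is only an illustration of the concept, with hypothetical helper names; it treats the fault as a pure bit flip on one gate input and is not the logical validation implemented in Quasar.

```python
def nand2(a, b):
    """Two-input NAND on 0/1 values."""
    return int(not (a and b))


def is_logically_masked(gate, inputs, faulted_input):
    """True when flipping the faulted input leaves the gate output unchanged."""
    nominal = gate(*inputs)
    flipped = list(inputs)
    flipped[faulted_input] ^= 1  # model the fault as a bit flip on that input
    return gate(*flipped) == nominal
```

With the first input held low, `is_logically_masked(nand2, (0, 1), 1)` is true, as in the scenario described above; with the first input high, the same flip reaches the output.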





Source: The author, 2024

#### 2.6.2 Electrical Masking

Electrical masking happens when the fault is attenuated as it propagates through the circuit and does not have enough amplitude to flip the logical value of the output. Figure 11 illustrates this. The fault originating at the NAND2 input is not logically masked, but the amplitude of the propagated fault at the output is not enough to cross the threshold of half the supply voltage. This means that the output is still read as a low logical value, the correct value. Section 4.2 explains how Quasar guarantees that a non-logically masked fault is also not electrically masked.





Source: The author, 2024

#### 2.6.3 Temporal Masking

Temporal masking happens when the propagated fault does not persist at the output within the latching window of the memory element. Figure 12 illustrates this. The propagated fault reaches the output with sufficient amplitude, but it does so when the memory element is not reading, meaning that the fault is not captured. As clock frequencies increase, latching windows become more frequent, making temporal masking less likely. Currently, Quasar has no way to guarantee that a fault will not be temporally masked.

Figure 12 – Temporal Masking.



Source: The author, 2024

Finally, Figure 13 illustrates a case where no masking occurs: the fault propagates to an output with sufficient amplitude and within the timing window in which a memory element is reading that output, thus changing the logical value at that element and creating an error.





Source: The author, 2024

### **3 RELATED WORK**

There are many nuances to analyzing circuit reliability under SETs. Some works adopt the Soft Error Rate (SER) as a metric of reliability (RAJARAMAN et al., 2006; HAMAD; MOHAMED; SAVARIA, 2016; LI; DRAPER, 2016), some use the minimal LET to represent the critical charge of the circuit (ANDJELKOVIC; KRSTIC, 2024b), while others use the pulse width of the fault at the output to determine SET sensitivity. Circuit analysis can be done through fully analytical methods (RAJARAMAN et al., 2006) or can be simulation based (AGUIAR et al., 2016; V; MITTAL; KUMAR, 2023). Simulation-based approaches usually rely on TCAD, electrical or RTL simulations. Finally, different methodologies have different targets of analysis. Some intend to precisely evaluate the response of logic gates to radiation (ANDJELKOVIC; KRSTIC, 2024b; LI; DRAPER, 2016), while others take a whole circuit into account (RAJARAMAN et al., 2006; LI; DRAPER, 2016; AGUIAR et al., 2016). Some approaches only take into consideration specific transistor devices, while others only consider specific architectures such as Field-Programmable Gate Arrays (FPGA) (AZIMI et al., 2018).

Different approaches can be used to help determine the circuit response to radiation. Some works explore the use of Machine Learning (ML) and LookUp Tables (LUT), while others rely directly on simulations.

This chapter presents other radiation evaluation tools. Four tools and methodologies are discussed in depth in their own sections; other works are briefly overviewed at the end, and the chapter closes with a table comparing all works.

#### 3.1 Soft Error Analysis Toolset (SEAT)

Soft Error Analysis Toolset (SEAT) is a toolset developed at the Pennsylvania State University to evaluate the impact of SETs on digital circuits. Its tool SEAT-Logic Analyzer (SEAT-LA) (RAJARAMAN et al., 2006) proposes an analytical method to determine circuit response to SETs. Each logic gate analyzed is characterized regarding its current-voltage characteristics. A SET is modeled as a triangular/trapezoidal voltage signal, which is propagated through the circuit using the model derived from the characterized gates for a specific technology. The main difference between Quasar and SEAT-LA is that the latter does not calculate the minimal LET necessary to propagate a fault; instead, it calculates the SER of the circuit. This is an important difference, as Quasar allows circuit designers to determine the weak points of a circuit and how they compare to each other.

Experimental results of the tool are demonstrated for the ISCAS 85 C17 benchmark (C17, 1985), much as this work does later. The tool also uses Synopsys HSPICE<sup>®</sup> to model the same faults as our proposal and compares the results obtained from its methodology with those given by the electrical simulator. In later works, SEAT is used to evaluate how some types of variability impact circuit reliability (RAMAKRISHNAN et al., 2007). Process variability is considered, including $V_{th}$ variation, and the SER is determined for the evaluated circuits, including C17.

#### 3.2 IHP METHODOLOGY

Although not named, the Leibniz Institute for High Performance Microelectronics (IHP) developed a methodology for characterizing circuit response to radiation (ANDJELKOVIC, Marko et al., 2019). The methodology operates at the electrical level using SPICE simulations. It takes a database of cells to be evaluated and determines both the $\text{LET}_{th}$ and a pulse width model for all the cells. It uses the Messenger model explained in Section 2.5 to derive the critical charge and a different bias-dependent model to determine pulse width.

The same methodology is later used to efficiently generate compact LUTs of characterized gates (ANDJELKOVIC; KRSTIC, 2024a). This database can later be used to make a full circuit evaluation. The gates evaluated with this methodology are implemented with IHP's own 130 nm CMOS transistors.

Quasar differs from this methodology in a few ways. The main one is Quasar's ability to fully evaluate a whole circuit autonomously, not only gates. Quasar also fully automates variability analysis, which this methodology does not.

#### 3.3 POLYTECHNIQUE MONTREAL RTL METHODOLOGY

The Groupe de Recherche en Microélectronique et Microsystèmes in Montreal, Canada, published multiple papers analysing SET effects at the RTL level of abstraction. In (HAMAD; MOHAMED; SAVARIA, 2016), a methodology for analyzing fault propagation paths is presented. It uses gate libraries already characterized for radiation response, like the ones generated by the previous methodology. It applies Multiway Decision Graphs (MDG) to help determine the input states that allow a fault to propagate an error to an output. In (HAMAD; HASAN, et al., 2014b,a), a more abstract model for SETs is proposed using a similar MDG approach.

Other works of the group analyze other characteristics of propagation paths at the electrical level, particularly how reconvergent paths can broaden a SET pulse (HAMAD; HASAN, et al., 2013). Finally, the group also developed the RASVAS methodology (HAMAD; MOHAMED; SAVARIA, 2015), which also operates at the electrical level and determines the probability that a SET fault will propagate to an output, considering circuit input state and pulse broadening, while using characterized gate libraries and a simplified SET model.

The RTL approach to circuit reliability presented in this group's work differs from most of the others by working at a high level of abstraction. The logical abstraction and propagation path search is similar to Quasar's logical validation presented in Section 4.3.1, although the former is much more in depth, as it models the fault instead of treating it only as a boolean state change. Their methodology also does not calculate critical charge; instead, the results imply the SER through the fault propagation probabilities for the circuit.

#### 3.4 AGUIAR'S TOOL

A similar tool has been developed by our research group to serve a similar purpose (AGUIAR et al., 2016). Like Quasar, Aguiar's tool operates at the electrical level using a SPICE simulator and Messenger's double exponential model for a radiation fault. Unlike Quasar, the tool allows the analysis of permanent faults, specifically Stuck-On and Stuck-Open faults. Its transient fault mode allows the simulation of SET faults, but compared to Quasar the tool is limited. It works by giving designers an interface that lets them choose a faulty node, insert a LET and analyze the propagated fault at an output. Quasar is more complete, as it automatically identifies all possible fault cases and determines the minimal LET for all of them without needing designer input. Quasar also allows automatic process variability analysis which, although possible in Aguiar's tool by manually changing model parameters, is not done automatically there.

#### 3.5 OVERVIEW

There are several other tools that help with radiation robustness evaluation.

Azimi et al. (2018) propose SETA, a tool that specifically analyzes SETs in FPGAs. It provides no automation or metric to determine robustness; rather, it allows designers to insert parameters for SET simulation. The model is simplified, but it works at a high architectural level without giving up electrical simulation.

V, Mittal, and Kumar (2023) developed a tool that employs ML trained on 14 nm FinFET TCAD simulations to predict the SET pulse shape. Although there is no metric to determine when a fault becomes an error, the work shows promising results on how ML can be employed to evaluate the impacts of radiation more precisely.

Li and Draper (2016) present an approach based on LUTs and electrical simulation to accelerate SER estimation for circuits. It uses the double exponential model for SETs and a memoization method to calculate the SER of a circuit based on the SET response of its constituent gates.

Finally, Table 4 shows a compiled comparison of the related work with Quasar. The main points of comparison are:

- **Abstraction Level**, the scope of analysis, which can range from transistor level with TCAD simulations to a high level of abstraction such as RTL;
- **Simulation Level**, which is highly correlated with the abstraction level but can vary, especially at circuit level;
- **Fault Model**, usually Messenger's double exponential or the simpler Trapezoidal, but it can be Custom or even the TCAD output when operating at TCAD level;
- **Robustness Metric**, which indicates how the tool or methodology assesses the vulnerability of the analyzed object. A tool with no robustness metric has no way to automatically assess robustness;
- **Process Variability**, which indicates whether the tool has built-in support for analyzing the impacts of process variability;
- **Strategy**, the main method implemented to help analyze radiation effects. Some works employ LUTs of already characterized gates, some need manual input from the designer to test SET values, among others.

| Work                                                                      | Abstraction<br>Level | Simulation<br>Level | Fault<br>Model                            | Robustness<br>Metric                | Process<br>Variability | Strategy                                         |
|---------------------------------------------------------------------------|----------------------|---------------------|-------------------------------------------|-------------------------------------|------------------------|--------------------------------------------------|
| SEAT,<br>Rajaraman<br>et al. (2006)                                       | Gate &<br>Circuit    | Analytical          | Trapezoidal                               | SER                                 | No                     | Analytical                                       |
| IHP,<br>Marko<br>Andjelkovic<br>et al. (2019)                             | Gate                 | Electrical          | Double<br>Exponential<br>&<br>Trapezoidal | Critical<br>Charge &<br>Pulse Width | Yes                    | LUT                                              |
| Polytechnique<br>Montreal,<br>Hamad,<br>Mohamed,<br>and Savaria<br>(2016) | RTL                  | RTL                 | Custom                                    | SER                                 | No                     | Multiway<br>Decision<br>Graph                    |
| Aguiar et al.<br>(2016)                                                   | Gate &<br>Circuit    | Electrical          | Double<br>Exponential                     | None                                | No                     | Manual                                           |
| SETA,<br>Azimi et al.<br>(2018)                                           | FPGA                 | Electrical          | Trapezoidal                               | None                                | No                     | Manual                                           |
| V, Mittal,<br>and Kumar<br>(2023)                                         | Transistor &<br>Gate | TCAD                | TCAD                                      | None                                | Yes                    | ML                                               |
| Li and<br>Draper<br>(2016)                                                | Gate &<br>Circuit    | Electrical          | Double<br>Exponential                     | SER                                 | No                     | LUT                                              |
| Quasar                                                                    | Gate &<br>Circuit    | Electrical          | Double<br>Exponential                     | Critical<br>Charge                  | Yes                    | Logical<br>Simulation<br>& Critical<br>LET<br>Search |

Table 4 – Related Work.

Source: The Author, 2024.

#### 4 TOOL PROPOSAL AND DEVELOPMENT

Quasar's main objective is to provide an easy and accessible way for designers to automatically characterize a circuit's radiation response. It automatically determines how a SET fault can cause an error and determines robustness metrics, such as the minimal LET for every relevant configuration in which a fault can occur. It also fully automates the evaluation of the impact of variability on the radiation response of a circuit, outputting graphs that plot this relationship.

Figure 14 presents an overview of the layer model adopted in Quasar's development. It has four layers of abstraction:

- **Layer 0 - Fault Modeling** makes a single call to the electrical simulator and simulates a single SET fault;
- **Layer 1 - Critical LET Search** makes multiple serial calls to Layer 0 to determine the minimal LET of a SET configuration;
- **Layer 2 - Circuit Level Evaluation** makes multiple parallel calls to Layer 1 to gather the minimal LET of every SET configuration of a circuit and takes the smallest of them, the LET<sub>th</sub>;
- **Layer 3 - Variability Evaluation** makes multiple parallel calls to Layer 2 to determine the LET<sub>th</sub> of a given circuit under variability of some parameters.

The development and concepts adopted in each layer are presented in the following sections.
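How Layers 2 and 3 build on Layer 1 can be sketched as plain function composition. Everything named here (`FaultConfig`, `circuit_let_th`, `variability_let_th`, and the callback that stands in for Layer 1) is hypothetical and only illustrates the structure described above; parallel execution is omitted for brevity.

```python
from typing import Callable, List, NamedTuple, Optional, Sequence, Tuple


class FaultConfig(NamedTuple):
    """Hypothetical record: faulted node, output interface and input vector."""
    node: str
    output: str
    inputs: Tuple[int, ...]


# Stand-in for Layer 1: minimal LET of one configuration, or None when masked.
CriticalLet = Callable[[FaultConfig], Optional[float]]


def circuit_let_th(configs: Sequence[FaultConfig], critical_let: CriticalLet) -> Optional[float]:
    """Layer 2: the smallest minimal LET over all fault configurations."""
    lets = [critical_let(cfg) for cfg in configs]
    valid = [let for let in lets if let is not None]
    return min(valid) if valid else None


def variability_let_th(samples: Sequence[float], configs: Sequence[FaultConfig],
                       make_critical_let: Callable[[float], CriticalLet]) -> List[Optional[float]]:
    """Layer 3: one Layer 2 evaluation per sampled parameter value."""
    return [circuit_let_th(configs, make_critical_let(s)) for s in samples]
```

In this sketch a masked configuration is represented by `None` and simply ignored when taking the minimum.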





Source: The author, 2024.

Quasar is integrated with a SPICE simulator to perform electrical-level simulation and provides a user interface for full circuit evaluations. It can currently be integrated with two electrical simulators: NGSPICE, an open source electrical simulator, and Synopsys HSPICE<sup>®</sup>. The implementation is also independent of any specific transistor model or type. The tool is fully implemented in Python, using mainly an object-oriented approach. The open source development and the modularity adopted in the Quasar code allow easy integration with other electrical simulators, as well as adaptation to the particularities of other technologies.

#### 4.1 LAYER 0 - SET MODELING

For a SET to create an error, the current pulse generated at the faulted node has to propagate with significant amplitude to an output interface. Therefore, the abstraction created by this layer is a function that takes the faulted node, an output interface, the current pulse, and the circuit input vector, and returns the minimum or maximum voltage at the output interface if it was in the high or low logic state, respectively.

The SET fault's current pulse is generated by a double exponential current source inserted between the faulted node and the ground (gnd) node, as described in Section 2.5. This layer provides no guarantee that the fault will not suffer any kind of masking.
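As an illustration of what such a source can look like in a netlist, the sketch below builds one line using the standard SPICE `EXP(I1 I2 TD1 TAU1 TD2 TAU2)` independent source. Whether Quasar emits the source in exactly this form is an assumption; the helper name, the default delay and the source name are hypothetical. With both delays equal, the `EXP` waveform reduces to the difference of two exponentials used in Section 2.5.

```python
def set_current_source(node, i0, p_hit, delay=1e-9,
                       tau_alpha=164e-12, tau_beta=50e-12,
                       ground="gnd", name="Iset"):
    """Build a SPICE line for a double exponential current source (illustrative only).

    i0 is the scale factor of the pulse, I(t) = i0 * (exp(-t/tau_alpha) - exp(-t/tau_beta)),
    not its peak value. In SPICE, positive current flows from the first node,
    through the source, to the second node.
    """
    if p_hit:
        # Inject current into the node: low-to-high transition.
        n_plus, n_minus = ground, node
    else:
        # Draw current out of the node: high-to-low transition.
        n_plus, n_minus = node, ground
    # TD1 == TD2 so the rise (tau_beta) and fall (tau_alpha) terms start together.
    return (f"{name} {n_plus} {n_minus} "
            f"EXP(0 {i0:g} {delay:g} {tau_beta:g} {delay:g} {tau_alpha:g})")
```

For example, `set_current_source("n1", 1e-4, p_hit=True)` yields a source that pushes current from ground into node `n1`.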

One SET fault insertion and propagation is done within a single electrical simulation. The vast majority of simulator calls in Quasar simulate a SET, and the number of SET simulations is considered the main performance bottleneck. Most optimizations try to reduce the total number of SET simulations, which can reach hundreds of thousands in some cases. Section 4.3.1 presents the most impactful optimization implemented so far, which significantly reduces the number of simulations needed to generate a circuit level evaluation. Internally, Quasar works with the value of the pulse's current, only converting it to the critical charge and LET results at the end of the process.

#### 4.2 LAYER 1 - CRITICAL LET SEARCH

The layers differ in the abstraction they provide for fault simulation and evaluation. Layer 1 adds detail to Layer 0 by considering that, for a SET fault to become an error, it needs to propagate to an output interface and cause a bit flip there, which is considered to have happened when the voltage at the output interface reaches half the supply voltage. Thus, in Layer 1, a fault configuration is defined as a combination of a faulted node, an output interface, and the circuit state given by its inputs. The responsibility of this layer is to find the minimal LET that a SET needs in order not to be electrically masked.

#### 4.2.1 Electrical Validation

First, before determining the critical LET of a fault configuration, it is important to determine whether the configuration is logically masked. When calling this layer, it is possible to indicate that a fault configuration is safe, that is, not logically masked. If this is not indicated, logical masking is determined electrically before the search for the critical LET. This requires two extra simulations: 1) one to get the logic state of both the faulted node and the output, information that is known beforehand in the safe case, and 2) one to verify logical masking. The latter is done by inserting an extremely high current pulse at the faulted node. If no change is detected at the output interface, the configuration is invalid and the Critical LET Search stops, returning a Null LET.
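The validation step can be sketched as follows. The `simulate` callback is a hypothetical stand-in for Layer 0 that maps a pulse current to the extreme output voltage; using a zero-current call to obtain the nominal state is an assumption made to keep the example short, not a description of Quasar's internals.

```python
def is_configuration_valid(simulate, vdd, probe_current):
    """Return True when a very large pulse flips the output (not logically masked)."""
    half = vdd / 2.0
    nominal = simulate(0.0)             # extra simulation 1: fault-free output level
    stressed = simulate(probe_current)  # extra simulation 2: extremely high pulse
    if nominal > half:
        # Output nominally high: a valid fault must pull it below VDD/2.
        return stressed < half
    # Output nominally low: a valid fault must push it above VDD/2.
    return stressed > half
```

When this check returns false, the search described next would be skipped and a Null LET reported for the configuration.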

#### 4.2.2 Critical LET Search

With the validity of the configuration confirmed, a search for the critical LET is done. The search takes into consideration the output voltage response to transient faults. It is done for a single faulted node, output node and input state, and uses the abstraction provided by the SET Modeling layer to treat the whole electrical simulation as a function that takes a current as input and gives a voltage as output, as presented in Equation 2.

Figure 15 shows a simplified overview of the Critical LET Search process. The boxes in dark gray represent the calling of the previous layer. This layer is abstracted as a function that inputs a fault configuration and returns the minimal LET to propagate the fault.





As the objective of the LET Search is to find the current that makes the output voltage equal to half the supply voltage, this value is subtracted from the output so that the target voltage is 0. The target current is thus the root of the function.

$$Voltage = f(Current) \tag{2}$$

For a valid fault configuration, that is, one that allows a sufficient current pulse to change the output voltage, the shape of this function resembles a step, as shown in Figure 16. This happens because for low current values the fault is electrically masked, so there is no change to the output voltage nominal value, and for high enough currents the fault propagates with sufficient intensity and width that the output voltage is completely flipped.

Although there is some insignificant fluctuation due to electrical properties, the function is overall monotonic, either increasing or decreasing depending on the nominal output voltage. Figure 17 shows a zoom of the transition zone of the function of Figure 16. In all experiments performed, this shape resembled a sigmoid function. This shape is favorable for root searching, as it approximates a linear function near the root.



Figure 16 – Output Voltage Response to Current Pulse Increase.

Source: The author, 2024.

Figure 17 – Output Voltage Response to Current Pulse Increase Transition Zone.



Source: The author, 2024.

Different approaches were used to find the root. Originally, before the shape of the function was completely understood, a Bisection search was used. Starting from an upper and a lower bound, in each iteration the midpoint is calculated and replaces the appropriate bound. This cuts the search space in half in each iteration, which guarantees that the root is found in a bounded number of iterations given a fixed precision and initial bounds.
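A minimal sketch of this Bisection search is shown below. The `output_voltage` function is a synthetic sigmoid that stands in for a Layer 0 simulation; its parameters, the function names and the current units are hypothetical and chosen only to make the example runnable.

```python
import math

VDD = 0.7  # supply voltage (V), as in Table 3


def output_voltage(current, i_crit=137.0, width=2.0):
    """Synthetic stand-in for one SET simulation (arbitrary current units)."""
    return VDD / (1.0 + math.exp((current - i_crit) / width))


def bisection_search(f, low, high, tol_v=1e-3, max_iter=60):
    """Find the current where f crosses VDD/2; returns (current, simulator calls)."""
    target = VDD / 2.0
    g_low = f(low) - target
    g_high = f(high) - target
    calls = 2  # the two bound evaluations
    if g_low * g_high > 0.0:
        raise ValueError("bounds do not bracket the root; expand them first")
    for _ in range(max_iter):
        mid = (low + high) / 2.0
        g_mid = f(mid) - target
        calls += 1
        if abs(g_mid) <= tol_v:
            return mid, calls
        if g_low * g_mid < 0.0:
            high = mid                 # root lies between low and mid
        else:
            low, g_low = mid, g_mid    # root lies between mid and high
    raise RuntimeError("precision not reached")
```

Calling `bisection_search(output_voltage, 100.0, 200.0)` returns a current close to the synthetic critical value together with the number of evaluations spent.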

The Bisection, although simple and reliable, does not exploit the shape of the function to find the root. After the shape was understood, a Secant search approach was implemented. This search takes two points and draws a line through both. The point where the line crosses the horizontal axis gives the next current, for which the voltage is then calculated. The oldest of the two points is discarded and the new point takes its place. This process continues until the root is found. The Secant search is only viable in the transition zone of the function, as a line drawn through two points on a plateau would be almost horizontal and give no meaningful information. Therefore a Bisection is done until the transition zone is found, and then a Secant search is done. This approach is faster than the standard Bisection, but it sometimes diverges when some points fall slightly outside the transition zone; in these cases the Secant search is halted and a standard Bisection is done. The Secant search is faster because the function approaches a linear shape in the transition zone, so if both points are in the zone the next point will be very close to the root.
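Written explicitly, the Secant update takes the following form, where $g$ denotes the shifted function whose root is sought and $V_{DD}$ the supply voltage (both symbols are notation introduced here for illustration, with $f$ as in Equation 2):

$$I_{n+1} = I_n - g(I_n)\,\frac{I_n - I_{n-1}}{g(I_n) - g(I_{n-1})}, \qquad g(I) = f(I) - \frac{V_{DD}}{2}$$

On a plateau $g(I_n) \approx g(I_{n-1})$, so the denominator approaches zero and the next point is thrown far from the root, which is consistent with the divergence described above.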

Finally, a third method was implemented, the False Position search. It is very similar to the Secant search in that it creates a line between both points and takes the point where it crosses the horizontal axis as the new one. The main difference is that, instead of discarding the older of the two original points, it discards the point that falls on the same side of the root as the new point, as a Bisection would do. This guarantees that the root is always bounded by the two points, so the search never diverges, while still exploiting the linear shape of the function, as the Secant search does.
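A minimal sketch of the False Position search is given below. As an assumption for illustration, the simulated circuit is replaced by a hypothetical sigmoid `residual` (output voltage minus a threshold), and the tolerance and iteration cap are illustrative values, not Quasar's parameters.

```python
import math

def residual(current_na: float) -> float:
    """Hypothetical stand-in for one simulation: sigmoid output minus a 0.35 V threshold."""
    return 0.7 / (1.0 + math.exp((current_na - 137.0) / 2.0)) - 0.35

def false_position(f, low, high, tol=1e-3, max_iter=200):
    """Return (root, evaluations); the root stays bracketed by [low, high] throughout."""
    f_low, f_high = f(low), f(high)
    evaluations = 2  # the two bound evaluations
    if f_low * f_high > 0:
        raise ValueError("bounds do not bracket a root")
    for _ in range(max_iter):
        # Point where the line through both bounds crosses the horizontal axis.
        new = high - f_high * (high - low) / (f_high - f_low)
        f_new = f(new)
        evaluations += 1
        if abs(f_new) <= tol:
            return new, evaluations
        # Discard the bound on the same side of the root as the new point.
        if f_new * f_low > 0:
            low, f_low = new, f_new
        else:
            high, f_high = new, f_new
    raise RuntimeError("no convergence within max_iter iterations")

root, evaluations = false_position(residual, 100.0, 200.0)
```

Because the discarded bound is always the one on the same side as the new point, the two remaining points keep opposite signs, which is the bracketing property that the plain Secant update lacks.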

To measure the effectiveness of each root search method, they were applied to find the critical LET of all 53 fault configurations of the standard mirror full adder design. The search was configured with a precision of 1 mV, and the bounds of the search start at 100 and 200 nA and are expanded if needed. Table 5 shows the number of simulations needed to obtain this data for the circuit, as well as the average number per fault configuration.

| Method         | Total Simulations | Avg. Simulations per Config. |
|----------------|-------------------|------------------------------|
| Bisection      | 640               | 12.08                        |
| Secant         | 413               | 7.79                         |
| False Position | 380               | 7.17                         |

Table 5 – Number of Simulations to Determine the Critical LET.

Source: The Author, 2024.

In the best case, the number of simulations per configuration would be exactly one: if the current value was predicted beforehand, only a single simulation would be needed to confirm it. If a prediction falls within the transition zone, but not close enough to the root, it might be possible to reliably determine where the root is due to the linearity of the function, totaling two simulations: the starting one and a second to confirm the root. Finally, with no prediction, two simulations are needed to set the bounds and then one new simulation per iteration. Currently, False Position takes on average five iterations to find the root with the required precision. Without a prediction method, it would be hard to significantly reduce the number of simulations.
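As a quick check of the figure of five iterations, the False Position row of Table 5 can be broken down as follows, assuming (this is not stated explicitly) that the two bound simulations rarely had to be repeated with expanded bounds:

$$\frac{380}{53} \approx 7.17 \ \text{simulations per configuration}, \qquad 7.17 - 2 \approx 5.17 \ \text{iterations}$$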

#### 4.3 LAYER 2 - CIRCUIT LEVEL EVALUATION

The LET<sub>th</sub> of a circuit is defined as the minimal LET necessary to cause a bit flip in an output interface in the most sensitive fault configuration. To find the LET<sub>th</sub>, every valid fault configuration has to be considered, and finding it is the responsibility of this layer. It is possible to simply generate every combination of the factors that compose a fault configuration and let the Critical LET Search determine its validity electrically. However, this is not ideal: the main performance bottleneck is the number of electrical simulations, and electrical validation requires two extra simulations for every possible fault configuration. As an illustration, a test was run using the default parameters of the search for the minimum LET and a Mirror Full Adder as a benchmark. With this method, 41% of all simulations were spent only on validating fault configurations.

Firstly, there is a simple filter to discard impossible faults. If a node is not connected to a terminal of a PMOS device, it cannot undergo a 0-1-0 fault; likewise, a node not connected to a terminal of an NMOS device cannot undergo a 1-0-1 fault (DUAN; WANG; LAI, 2011). Thus, every fault configuration with these properties can be discarded before any further consideration. Secondly, to avoid unneeded electrical validation, a logical validation of all remaining fault configurations is done before any electrical simulation.
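The terminal-connection filter can be illustrated with a short sketch. The `Transistor` record and the function name are hypothetical, and the sketch assumes that "terminal" means the source or drain of the device (the gate is not counted).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Transistor:
    """Hypothetical transistor record used only for this illustration."""
    name: str
    kind: str    # "pmos" or "nmos"
    source: str
    gate: str
    drain: str

def possible_pulses(node, transistors):
    """Pulse shapes that the filter keeps for a node, based on the device
    types whose source or drain touches it."""
    kinds = {t.kind for t in transistors if node in (t.source, t.drain)}
    pulses = set()
    if "pmos" in kinds:
        pulses.add("0-1-0")
    if "nmos" in kinds:
        pulses.add("1-0-1")
    return pulses

# NAND2 example: "mid" is the internal node of the NMOS series stack.
nand2 = [
    Transistor("p1", "pmos", "vdd", "a", "out"),
    Transistor("p2", "pmos", "vdd", "b", "out"),
    Transistor("n1", "nmos", "out", "a", "mid"),
    Transistor("n2", "nmos", "mid", "b", "gnd"),
]
out_pulses = possible_pulses("out", nand2)
mid_pulses = possible_pulses("mid", nand2)
```

In this example the internal node `mid` touches only NMOS devices, so its 0-1-0 configurations would be discarded before any simulation.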

#### 4.3.1 Logical Validation

To logically validate a given fault, the circuit is modeled as a logical graph-like structure. Every electrical node is modeled as a Boolean node, and for every transistor an undirected edge is created between its two terminals (source and drain). This edge has a Boolean state that indicates whether it conducts, depending on the state of a control node.

Figure 18 shows the circuit and the graph model for a NAND2 gate. The circle in the middle of each edge indicates which node's state controls conduction. A white/black circle means it conducts when the control node is in low/high logic state, and doesn't in the high/low state, respectively.

Figure 19 shows the workflow of this layer: it inputs a circuit and outputs all valid configurations with their respective minimal LETs, named the circuit's radiation profile.



Figure 18 – NAND2 Gate Transistor Topology and Graph Model.

Figure 19 – Variability Evaluation Layer.



Source: The author, 2024.

With the graph built, all possible fault configurations are modeled in it. For the simulation to be correct, the state of the gate of every transistor must be determined before the states of its terminals; the following model ensures this.

The first step of the simulation is to split the graph into Electrical Path Groups. An Electrical Path Group (EPG) is defined as a group of nodes that connect to each other through the source/drain terminals of transistors, so there is an electrical path from every node of the group to every other node, even if the path is not conducting. Voltage sources are not considered in this definition and do not belong to any EPG. Figure 20 shows the graph model of a classical Mirror Full Adder; some node names were omitted to avoid visual clutter, and voltage source nodes were replicated as they do not belong to any group. In the graph, the EPGs can be visually identified as the seven disjoint subgraphs.
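One possible way to compute the EPGs is a union-find pass over the source/drain connections, sketched below. The tuple layout `(source, gate, drain)` and the supply names `vdd` and `gnd` are assumptions made for the example, and this is not necessarily how Quasar implements the split.

```python
def electrical_path_groups(transistors, supplies=("vdd", "gnd")):
    """Group nodes connected through source/drain terminals.

    transistors: iterable of (source, gate, drain) tuples (assumed layout).
    Supply nodes are excluded, so they never merge two groups.
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for source, _gate, drain in transistors:
        ends = [n for n in (source, drain) if n not in supplies]
        for n in ends:
            find(n)  # register the node even if it stands alone
        if len(ends) == 2:
            parent[find(ends[0])] = find(ends[1])

    groups = {}
    for node in parent:
        groups.setdefault(find(node), set()).add(node)
    return sorted(sorted(group) for group in groups.values())

# NAND2 (output x, internal node m) driving an inverter (output y).
nand2_inv = [
    ("vdd", "a", "x"), ("vdd", "b", "x"),
    ("x", "a", "m"), ("m", "b", "gnd"),
    ("vdd", "x", "y"), ("y", "x", "gnd"),
]
groups = electrical_path_groups(nand2_inv)
```

Here the NAND2 output and its internal stack node form one group and the inverter output forms another; the primary inputs `a` and `b` only drive gates, so they belong to no group.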

Figure 20 – Mirror Full Adder Graph Model.



Source: The author, 2024.

Another graph is then created with one node representing each EPG. For every transistor in the circuit, an arc is created going from the group containing the gate node to the group containing the source and drain nodes. For the Mirror Full Adder example this results in the graph shown in Figure 21, where each EPG is labeled with the name of one of its nodes. A Topological Sorting algorithm is run on the EPG graph to define the order in which the EPGs will be logically simulated. This order is crucial, as it guarantees that, for every transistor, the logical state of its gate is determined before the states of its source and drain are considered. If a sorting cannot be done, there is no logical validation: all fault configurations are marked as unsafe and are electrically validated.
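The ordering step can be sketched with the standard library's `graphlib` module (Python 3.9 or later). The function name, the `(source, gate, drain)` tuple layout and the `group_of` mapping are hypothetical; returning `None` stands for the fallback to electrical validation described above.

```python
from graphlib import CycleError, TopologicalSorter

def epg_order(transistors, group_of):
    """Return the EPG simulation order, or None when no order exists.

    transistors: (source, gate, drain) tuples (assumed layout).
    group_of: node -> EPG label; supplies and primary inputs are absent.
    """
    sorter = TopologicalSorter()
    for label in set(group_of.values()):
        sorter.add(label)
    for source, gate, drain in transistors:
        gate_group = group_of.get(gate)
        if gate_group is None:
            continue  # gate driven by an input or supply: no arc needed
        for terminal in (source, drain):
            target = group_of.get(terminal)
            if target is None:
                continue
            if target == gate_group:
                return None  # a group that controls itself cannot be ordered
            sorter.add(target, gate_group)  # gate group must come first
    try:
        return list(sorter.static_order())
    except CycleError:
        return None

# NAND2 (group "x") driving an inverter (group "y").
nand2_inv = [
    ("vdd", "a", "x"), ("vdd", "b", "x"),
    ("x", "a", "m"), ("m", "b", "gnd"),
    ("vdd", "x", "y"), ("y", "x", "gnd"),
]
order = epg_order(nand2_inv, {"x": "x", "m": "x", "y": "y"})
```

A circuit with feedback, such as two cross-coupled inverters, produces a cycle between groups and therefore no order.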

Figure 21 – Mirror Full Adder Electrical Path Groups Graph.



Source: The author, 2024.

If an order is possible, all fault configurations are modeled in the original circuit graph. The logical value of every node is set to undefined, except for voltage supply and input nodes, which are set according to their logical values. Then, for each EPG, logical signals are propagated from these nodes. At the end of the logical simulation, the logical value of every node is known.

Algorithm 1 presents the process to set all logical values in an EPG. It starts from a list of all nodes that already have a known value and runs a depth-first search. It takes the node on top of the stack, propagates its signal through every conducting transistor that has a drain/source connected to it, and adds the node on the other side of the transistor to the search stack. By the end, every node in the EPG has the appropriate logical value.

To simulate a fault, every EPG starting from the faulted one is simulated again. The faulted EPG runs Algorithm 1 with the faulted node starting on top of the stack with a set value. This ensures that it propagates its signal to every possible node before other signal sources are considered. This is important because, when considering an arbitrarily large fault, its signal must have priority when propagating. When the full circuit has been simulated, if an output has a logical value different from its original one, the fault has propagated to it, which means the fault configuration is valid. Every fault configuration is tested in this way. Finally, all valid configurations are cached so that, if another Circuit Level Evaluation is done for the same circuit, the configurations are known beforehand.

|     | <b>Algorithm 1</b> Logical Setting of an Electrical Path Group.                    |
|-----|------------------------------------------------------------------------------------|
| 1:  | <b>function</b> SetLogic(<i>EPG</i>, <i>faultedNode</i>)                           |
| 2:  | &emsp;$depthFirstSearch \leftarrow [vdd, gnd, faultedNode]$                        |
| 3:  | &emsp;$seenNodes \leftarrow []$                                                    |
| 4:  | &emsp;<b>while not</b> $\text{Empty}(depthFirstSearch)$ <b>do</b>                  |
| 5:  | &emsp;&emsp;$node \leftarrow \text{PopBack}(depthFirstSearch)$                     |
| 6:  | &emsp;&emsp;<b>if</b> $node$ in $seenNodes$ <b>then continue</b>                   |
| 7:  | &emsp;&emsp;$\text{InsertBack}(seenNodes, node)$                                   |
| 8:  | &emsp;&emsp;<b>for each</b> $transistor$ in $Transistor(node)$ <b>do</b>           |
| 9:  | &emsp;&emsp;&emsp;<b>if not</b> $\text{Conducting}(transistor)$ <b>then continue</b> |
| 10: | &emsp;&emsp;&emsp;$otherNode \leftarrow transistor(node)$                          |
| 11: | &emsp;&emsp;&emsp;<b>if</b> $otherNode$ not in $EPG$ <b>then continue</b>          |
| 12: | &emsp;&emsp;&emsp;$\text{LogicState}(otherNode) \leftarrow \text{LogicState}(node)$ |
| 13: | &emsp;&emsp;&emsp;$\text{InsertBack}(depthFirstSearch, otherNode)$                 |
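A runnable sketch of Algorithm 1 is given below for illustration. The data layout (tuples of `(kind, source, gate, drain)` and a `state` dictionary) is hypothetical. The sketch also makes two explicit assumptions that go beyond the listing: visited nodes are recorded, and a node that was already visited is not overwritten, which is one way to give the faulted node the priority described in the text.

```python
def set_logic(epg, sources, transistors, state, faulted_node=None):
    """Propagate known logic values through the conducting transistors of one EPG.

    epg: set of node names in the group.
    sources: nodes with a known value, e.g. ["vdd", "gnd"].
    transistors: (kind, source, gate, drain) tuples, kind "nmos" or "pmos".
    state: dict node -> bool, updated in place; gate states must be known.
    """
    stack = list(sources)
    if faulted_node is not None:
        stack.append(faulted_node)  # on top of the stack: propagates first
    seen = set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        for kind, source, gate, drain in transistors:
            if node not in (source, drain):
                continue
            conducting = state[gate] if kind == "nmos" else not state[gate]
            if not conducting:
                continue
            other = drain if node == source else source
            # Assumption: already visited nodes keep their value.
            if other not in epg or other in seen:
                continue
            state[other] = state[node]
            stack.append(other)
    return state

# NAND2 with both inputs high; "mid" is the internal NMOS stack node.
nand2 = [
    ("pmos", "vdd", "a", "out"), ("pmos", "vdd", "b", "out"),
    ("nmos", "out", "a", "mid"), ("nmos", "mid", "b", "gnd"),
]
epg = {"out", "mid"}
base = {"vdd": True, "gnd": False, "a": True, "b": True}
nominal = set_logic(epg, ["vdd", "gnd"], nand2, dict(base))
faulty = set_logic(epg, ["vdd", "gnd"], nand2, dict(base, out=True), faulted_node="out")
```

In the nominal run the output is pulled low through the NMOS stack, while in the faulted run the value forced on `out` spreads to `mid` before the supplies are considered.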

#### 4.3.2 Parallelization Framework

Because the Critical LET Searches of different configurations are independent, they can be parallelized. For this purpose, a parallelization framework was developed. It takes a function and a list of function inputs; in this case, the Critical LET Search function and all valid fault configurations. The number of processes created depends on the specification of the machine running the tool. A pool of inputs and a pool of outputs are created, from which the processes request inputs and to which they post results. The framework can be configured to regularly log the results, which is useful for circuits with hundreds of transistors or for the Variability Evaluation, the last layer of abstraction, in which the full evaluation may take longer.

Figure 22 shows the components of the framework and Algorithm 2 shows the main operation of both worker and master processes. The worker behavior is simple: it requests an input from the input pool, runs the given function and posts the output to the output pool. If the input pool is empty, the worker process terminates. The master is responsible for coordinating the results. First, it gets the function inputs either from a backup or from the object construction and initializes both pools with the respective contents. Secondly, it creates and starts all worker processes. Finally, while there are jobs to be done, it periodically backs up both pools in case the program fails unexpectedly.

Figure 22 – Parallelization Framework.



Source: The author, 2024.

#### Algorithm 2 Parallel Framework Worker and Master Main Procedures.

```
function WORKERMAIN(Function, staticArgs, inputPool, outputPool)
    while not Empty(inputPool) do
        funcInput ← GetInput(inputPool)
        funcOutput ← Function(staticArgs, funcInput)
        PostResult(outputPool, funcOutput)

function MASTERMAIN(Function, staticArgs, inputList)
    storedData ← GetBackup()
    if storedData is empty then
        inputPool ← inputList
        outputPool ← []
    else
        inputPool ← GetInputs(storedData)
        outputPool ← GetOutputs(storedData)
    totalJobs ← Size(inputPool) + Size(outputPool)
    workers ← CreateWorkers()
    for each worker in workers do
        WorkerMain(Function, staticArgs, inputPool, outputPool)
    while Size(outputPool) < totalJobs do
        Sleep()
        Backup(inputPool, outputPool)
    return outputPool
```
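The worker and master structure of Algorithm 2 can be sketched in a few lines. To keep the example self-contained, it swaps worker processes for threads and omits the backup step, so it illustrates only the pool-based coordination and is not the framework itself; all function names are hypothetical.

```python
import queue
import threading

def worker_main(function, static_args, input_pool, output_pool):
    """Take inputs until the pool is empty, posting each result."""
    while True:
        try:
            # get_nowait avoids the race between checking for emptiness and taking.
            func_input = input_pool.get_nowait()
        except queue.Empty:
            return
        output_pool.put((func_input, function(*static_args, func_input)))

def master_main(function, static_args, input_list, n_workers=4):
    """Fill the input pool, run the workers and collect results by input."""
    input_pool, output_pool = queue.Queue(), queue.Queue()
    for item in input_list:
        input_pool.put(item)
    workers = [
        threading.Thread(
            target=worker_main,
            args=(function, static_args, input_pool, output_pool),
        )
        for _ in range(n_workers)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    results = {}
    while not output_pool.empty():
        key, value = output_pool.get()
        results[key] = value  # inputs must be hashable in this sketch
    return results

def scaled(scale, value):
    """Toy job standing in for one Critical LET Search."""
    return scale * value

results = master_main(scaled, (10,), [1, 2, 3])
```

For CPU-bound work in Python, processes rather than threads would be needed to obtain real speedups; threads are used here only to keep the sketch short.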

As an example of the parallelization optimization, an evaluation of the execution time was conducted for the C17 circuit from the ISCAS 85 benchmarks (C17, 1985). The C17 is a circuit with 24 transistors and 182 valid fault configurations. This experiment observes four possible execution flows of the Quasar tool: 1) electrical validation only, executed serially; 2) electrical validation only, executed in parallel; 3) logical validation with serial execution; and 4) logical validation with parallel execution. All tests were run using HSPICE as the electrical simulator on an AMD Ryzen 5 3600 6-Core Processor, a machine with 12 threads and a clock of 3.6 GHz. Table 6 shows the execution time for the four scenarios. The parallel execution with logical validation is the fastest, taking 48 s against 616 s for the serial execution with electrical validation, a speedup of about 12.8. Moreover, even in the serial case, logical validation alone reduces the execution time from 616 s to 288 s, a speedup of about 2.1, while parallelization provides a speedup of about 6 under either validation mode.

Table 6 – Circuit Level Evaluation Execution Time.

| Validation            | Parallel | Serial |
|-----------------------|----------|--------|
| Logical Validation    | 48 s     | 288 s  |
| Electrical Validation | 104 s    | 616 s  |

Source: The Author, 2024.

#### 4.4 LAYER 3 - VARIABILITY EVALUATION

An overview of the operation of the Variability Evaluation is presented in Figure 23. To evaluate how variability impacts circuit robustness, the variability parameters are defined first. A physical attribute of the transistor model is chosen according to the variability effects of the specific technology. This parameter is then modified to follow a variability distribution, generally defined by a Gaussian function. These distributions of values simulate the impact of process variability on the device behavior. The nominal value of the parameter is taken as the mean of the distribution. A percentage and a sigma ($\sigma$) value are provided to generate the standard deviation.

Finally, a number of Monte Carlo (MC) runs is provided and the same number of points is generated under the distribution using the mean and standard deviation. For each point, a full Circuit Level Evaluation is done, resulting in the full radiation reliability profile of the circuit at each point. Other statistics are also gathered, such as which node, output interface or input vector is the most sensitive depending on variability.
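The point generation can be sketched as below. The way the percentage and sigma combine into a standard deviation is an assumption made for this example (the percentage of the nominal value is taken to span `sigma` standard deviations), and the nominal value of 4.5 is purely illustrative.

```python
import random
import statistics

def sample_parameter(nominal, percentage, sigma, runs, seed=0):
    """Draw `runs` Gaussian samples centred on the nominal value.

    Assumption: `percentage` of the nominal value corresponds to `sigma`
    standard deviations, so std_dev = nominal * percentage / sigma.
    """
    std_dev = nominal * percentage / sigma
    rng = random.Random(seed)  # seeded for reproducible points
    return [rng.gauss(nominal, std_dev) for _ in range(runs)]

points = sample_parameter(nominal=4.5, percentage=0.05, sigma=3, runs=2000)
mean = statistics.fmean(points)
spread = statistics.stdev(points)
```

Each value in `points` would then parameterize one full Circuit Level Evaluation.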



Figure 23 – Variability Evaluation Layer.

Source: The author, 2024.

### **5 RESULTS**

To demonstrate Quasar's usability and how it can bring further insight to circuit design, three case studies are presented. Section 5.1 presents a case study on how gate mapping affects radiation robustness under the lens of variability, with results that have been published in Sandoval et al. (2023). Section 5.2 presents a case study on the impact of the pull-up and pull-down networks on radiation robustness. Section 5.3 presents a comparison between circuit results generated by this work and by the IHP Methodology, presented in Section 3.2.

## 5.1 GATE MAPPING CASE STUDY

During circuit design, an important decision is how to map logic equations to a logic circuit. Equation 3 shows the logic equations for the ISCAS 85 C17 benchmark (C17, 1985). It is composed of two functions, implemented with the same set of five logic variables. There are several ways to implement these equations as a circuit. Figure 24 shows three topologies for the C17 benchmark that implement these functions (SANDOVAL; BRENDLER; KASTENSMIDT; ZIMPECK, et al., 2023). Circuits are labeled according to the logic gates at their outputs.

$$\begin{aligned}
g1 &= C \land (\neg B \lor \neg D) \lor (A \land B) \\
g2 &= (\neg B \lor \neg D) \land (E \lor C)
\end{aligned} \tag{3}$$
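As an illustration of gate mapping, the sketch below checks exhaustively that Equation 3 matches the widely used six-NAND2 form of C17. The assignment of the variables A to E onto the inputs of that netlist is an assumption of this example, and the netlist is not claimed to be identical to any topology in Figure 24.

```python
from itertools import product

def nand(x, y):
    return not (x and y)

def c17_nand(a, b, c, d, e):
    """Six-NAND2 form of C17; the mapping of A..E onto its inputs is assumed."""
    n1 = nand(a, b)
    n2 = nand(b, d)
    n3 = nand(c, n2)
    n4 = nand(n2, e)
    return nand(n1, n3), nand(n3, n4)

def c17_equations(a, b, c, d, e):
    """Direct evaluation of the two functions in Equation 3."""
    g1 = (c and (not b or not d)) or (a and b)
    g2 = (not b or not d) and (e or c)
    return g1, g2

# Compare both descriptions on all 32 input vectors.
mismatches = [
    vector
    for vector in product([False, True], repeat=5)
    if c17_nand(*vector) != c17_equations(*vector)
]
```

An empty `mismatches` list means both descriptions implement the same pair of functions, even though their gate-level structures differ.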

To demonstrate how gate mapping and variability impact circuit robustness, all circuits were implemented in the 7 nm ASAP PDK presented in Section 2.2. A 2000-point variability analysis was done taking the work function (WF) as the variability parameter, with a standard deviation of 5%.

Table 7 shows the main statistics for the three topologies regarding radiation sensitivity, observing the  $\text{LET}_{th}$  of the circuits without considering process variability (Nominal) and the distribution of the results considering process variability by the mean  $(\mu)$ , standard deviation  $(\sigma)$ , minimum and maximum values.

| Circuit      | Nominal | $\mu$ | $\sigma$ | Min  | Max   |
|--------------|---------|-------|----------|------|-------|
| NAND2        | 70.7    | 67.9  | 13.2     | 18.8 | 100.5 |
| NOR2         | 51.8    | 51.5  | 12.5     | 8.8  | 81.7  |
| C17_NAND     | 70.7    | 66.4  | 11.7     | 18.8 | 88.0  |
| C17_Mixed    | 51.8    | 49.7  | 11.4     | 8.8  | 80.5  |
| C17_NOR      | 51.8    | 49.9  | 11.3     | 8.8  | 78.3  |

Table 7 – LET<sub>th</sub> distribution in MeV.cm<sup>2</sup>/mg.

Source: The Author, 2024.



Figure 24 – C17 Topologies.

Source: (SANDOVAL; BRENDLER; KASTENSMIDT; ZIMPECK, et al., 2023)

At first glance, it is possible to note that the circuits are divided into two distinct groups: those with an LET<sub>th</sub> of 70.7 MeV.cm<sup>2</sup>/mg and those with 51.8 MeV.cm<sup>2</sup>/mg. This separation can be explained by electrical masking. Usually, the most sensitive fault configuration of a circuit includes an output of the circuit, or a node close to an output, as its faulted node. This is because the propagation of a fault from a node far from the output attenuates the pulse. As the LET<sub>th</sub> only takes into consideration the most sensitive fault configuration, a circuit's nominal robustness will usually be the same as the minimal robustness of its output gates, explaining the separation.

Taking this into consideration, C17\_NAND seems like the superior topology for radiation robustness, and while this is the case at nominal conditions, variability brings further nuance to the discussion, since different circuits respond differently to variability. Figure 25 shows the LET<sub>th</sub> dispersion of both the NAND2 and NOR2 gates under WF fluctuation. Although overall the NAND2 gate is the more robust of the two, the gates respond differently to the fluctuation of each device. The NAND2 robustness responds to WF fluctuation in both the NMOS and PMOS devices, but the former impacts it much more than the latter.

For the NOR2, variations in the NMOS are insignificant when compared to those in the PMOS. Moreover, it is important to highlight that the most robust WF configuration is not the same for both gates. This motivates a point-by-point comparison between the two. Figure 26 shows the difference between the LET<sub>th</sub> of the NAND2 and NOR2 gates at each point. In some WF configurations of the devices the difference reaches negative values, meaning that in those cases the NOR2 gate is the more robust of the two.





Figure 26 – The difference between NAND2 and NOR2  $LET_{th}$  Dispersion.



Source: (SANDOVAL; BRENDLER; KASTENSMIDT; ZIMPECK, et al., 2023)

This impacts the hierarchy of the C17 topologies: in the cases where the NOR2 is more robust, C17\_NOR wins over C17\_NAND. Furthermore, this also explains why C17\_Mixed has the lowest average robustness: in every WF configuration it has the more sensitive of the two gates as one of its outputs. Figure 27 shows the most critical node of C17\_Mixed at each point; if a point is marked with a cross, g1 is the critical output, otherwise it is g2. The ratio between the most sensitive outputs follows the same trend as the ratio between the dominant gates seen in Figure 26.

This case study demonstrates the utility of Quasar when analyzing radiation robustness. Furthermore, it shows why variability can be such an important factor when evaluating circuit reliability, reinforcing the relevance of the tool's variability feature.



Figure 27 – C17\_Mixed Critical Node.

Source: (SANDOVAL; BRENDLER; KASTENSMIDT; ZIMPECK, et al., 2023)

#### 5.2 RESTORING NETWORK CASE STUDY

This case study has two main objectives: 1) show how Quasar can bring insight into circuit reliability under radiation at the transistor level, not only at the gate level; 2) explain the different behavior of the NAND2 and NOR2 gates under WF fluctuation. For this purpose, an analysis of the Inverter, NOR and NAND gates was done. For the NOR and NAND functions, gates with 2 to 4 inputs are considered. All circuits were implemented using the 7 nm model presented in Section 2.2 with minimal sizing. Figure 18a shows the topology of the NAND2 gate, while Figure 28 shows the topologies of the NOR2 and Inverter gates. In this experiment, the same devices and distribution parameters from the experiment of Section 5.1 were used.





Source: The Author, 2024.

Figure 29 shows a plot of the average  $\text{LET}_{th}$  of the gates analyzed. For each gate, the vast majority of the critical nodes in the distribution correspond to the circuit output. Each bar in the plot shows the proportion of pulse types in the distribution, either p-hit or n-hit. In both the NAND and NOR functions, as the number of inputs increases, the gates become more sensitive. It is also observed that for the majority of the NAND cells the n-hit is the critical pulse, while the p-hit is critical for the NOR function. Both facts have the same cause: the path that restoring currents take in the critical fault configuration.



Figure 29 – Average  $LET_{th}$  of the Analyzed Gates.

When charge is collected/depleted during a SET, it needs to be discharged/restored. A current needs to flow between the faulted node and either gnd or the supply node, depending on the charge type. Therefore, in most cases, the critical fault configuration is the one that makes the restoring current take the path of most resistance. This makes the charge take longer to be depleted/restored, hence less charge is needed to propagate a fault. In nominal conditions, for every gate of both the NAND and NOR functions, this happens when the restoring current needs to pass through the series network, e.g., when a NAND2 gate has both its inputs at high level, so that a fault at the output must be restored by a current that passes through the NMOS network.

Furthermore, as the number of gate inputs increases, the number of devices in the series arrangement also increases, weakening restoring currents and increasing sensitivity. Additionally, the NAND/NOR function is more dependent on NMOS/PMOS variability, respectively, because that is the transistor type of its series network. The rare cases where this does not apply only occur in extreme situations when, due to WF fluctuation, the conductance of the series network transistors is very high while the conductance of the parallel network is very low, meaning that the parallel network presents a higher resistance than the series network.

Source: The Author, 2024.

#### 5.3 IHP METHODOLOGY COMPARISON

Finally, in order to assess Quasar's correctness as a radiation reliability evaluation tool, its results were compared to similar results generated with another tool. For this purpose the IHP Methodology was chosen, as it presents results in Critical Charge, a metric equivalent to the one Quasar uses, and employs simulation and modeling approaches different from Quasar's.

The evaluated circuit was the NAND3 with the usual CMOS topology. Values from IHP's methodology were already known (ANDJELKOVIĆ, 2022), with the circuit implemented in IHP's own 130 nm PDK. In Quasar, the circuit was implemented using the 32 nm model presented in Section 2.1; both cases used minimal sizing for every transistor. Currently, the transistor model used by IHP is unsupported in Quasar's implementation, so a different one was used. This means that the absolute results differ between the transistor models due to their electrical properties. However, it is still possible to compare the results by assessing how circuit state impacts robustness. All input states were considered and only faults originating at the output node were analyzed. Results are presented in Critical Charge instead of LET, as this is the metric used by IHP. This conversion is done by Equation 1.

Table 8 shows the values found by both methodologies. The rows are ordered from the most robust state to the least robust. This is the first notable similarity: the order of robustness is the same in both cases. The grouping of fault configurations is also maintained: configurations with only one input in high state have similar critical charges in both cases, and the same holds for two inputs in high state. This grouping is also explained by the restoring currents: in every n-hit case, every input in low state represents a conducting transistor connected to the supply, which helps with charge restoration. More conducting transistors accelerate the restoration, meaning higher critical charges. The p-hit case is the critical fault configuration because the restoring is done by the series network.

| Input (a b c) | IHP  | Quasar | Ratio | Pulse Type |
|---------------|------|--------|-------|------------|
| 000           | 61.0 | 22.9   | 2.66  | n-hit      |
| 100           | 46.3 | 15.3   | 3.02  | n-hit      |
| 010           | 46.3 | 15.3   | 3.03  | n-hit      |
| 001           | 46.2 | 15.3   | 3.03  | n-hit      |
| 101           | 31.0 | 7.7    | 4.00  | n-hit      |
| 110           | 31.0 | 7.7    | 4.00  | n-hit      |
| 011           | 30.0 | 7.7    | 3.91  | n-hit      |
| 111           | 26.9 | 4.7    | 5.67  | p-hit      |

Table 8 – Critical Charge (fC) for NAND3 in different technologies and methodologies.

Source: The Author, 2024

Furthermore, in every case the critical charge for the 32 nm transistor is lower than for the 130 nm one, which is expected, as larger planar transistors usually require a higher charge to fault. The ratio between the two methodologies also has a low variance. This shows that, even though the circuits are implemented in different technologies, the proportions between the faults themselves are very similar.

Finally, this comparison not only shows that Quasar is able to generate results equivalent to those of other tools, but also that, even in different transistor implementations, the same circuit topology has a known behavior.

## 6 CONCLUSION

Radiation reliability presents a challenge to designers that has become ever more relevant with the miniaturization of devices. In this light, this work presented the development of a radiation evaluation tool to aid circuit designers in designing reliable systems.

Firstly, the main concepts behind radiation robustness adopted in this work were explained. The core properties of transistors, the main sources of process variability addressed and the physical phenomena behind radiation faults were also presented. The proposal to implement a radiation reliability evaluation tool is not new, thus other similar works proposing such tools were explored and compared to Quasar. Compared to the related work, Quasar stands out by fully automating the robustness evaluation and providing a simple way to consider process variability, something that few works have combined so far.

In Chapter 4, the functionality of Quasar was introduced. Its architecture is divided into four layers of abstraction. The first one, which is the only one that directly integrates with an electrical simulator, models the SET fault and provides an interface to easily insert it in any configuration of the circuit. The second layer automates the finding of the critical LET for a configuration, the metric used to assess robustness. This automation is one of the main contributions of this work, as it quickly solves a task that is arduous and time consuming to do manually. The third layer allows for an automatic circuit level evaluation. The logical validation is also one of the main contributions of Quasar, since, although not as tiresome as the previous task, it is also hard to do manually, especially on circuits with a high number of transistors. Finally, the fourth and last layer addresses variability and provides a simple interface to assess how the variation of physical parameters impacts robustness.

The tool is practical, and the results show that it allows a much quicker evaluation of radiation robustness in circuits. Results presented in Chapter 5 show practical examples of how Quasar can assist designers. These results also demonstrate how process variability can have a deep impact on robustness, even overturning the robustness hierarchy between gates. This highlights the relevance of the variability analysis to support decisions on design architecture. Furthermore, the results also demonstrate how Quasar can help gain further insight into SET behavior even at the transistor scope. Finally, a comparison of Quasar results with one of the most complete tools previously discussed was made, validating Quasar results.

Still, there remains a lot of room for Quasar to improve. As mentioned in Section 5.3, some transistor models are not supported. Furthermore, the process variability analysis is hard coded. As seen in Section 5.1, the change of  $\text{LET}_{th}$  caused by the WF fluctuation is continuous. Given enough data, it is possible to estimate these  $\text{LET}_{th}$  values and use the estimations as the starting point for the Critical LET Search presented in Section 4.2.2. This would allow for a significantly faster generation of the process variability analysis. Some of the necessary framework for this feature has already been developed, but the feature itself still has to be created. Other improvements can be made; for example, the current version of Quasar demands the manual entry of the circuit under evaluation and of the desired reports. Finally, one future objective is to make the tool interface compatible with those used in EDA tools. Quasar could then be integrated into automatic design flows, allowing these tools to consider radiation robustness more easily.

#### 6.1 PUBLICATIONS

The development of this work began at the end of 2020, with the start of initial scientific activities, and has led to several published results. The initial objective of the project was the evaluation of the impact of gate mapping on circuit reliability using the ISCAS85 C17 as a case study. For this purpose, a very rudimentary version of Quasar was developed only to determine the LET<sub>th</sub> of the circuit at nominal conditions. The first publication of this work was at the 36º Simpósio Sul de Microeletrônica (SIM), encompassing a simple evaluation of five topologies of the benchmark circuit (SANDOVAL; BRENDLER; KASTENSMIDT; REIS, et al., 2021).

As my understanding of radiation effects and the importance of topologies grew, the discussions and reflections on the tool development became more in-depth. The design reasons behind the different robustness of the circuits were included in the experiments and evaluations. An exploration of the radiation robustness of individual gates was done, as well as the evaluation of different mitigation techniques. This led to a publication at the *IEEE* 22nd Latin American Test Symposium (LATS) (SANDOVAL; BRENDLER; ZIMPECK, et al., 2021).

During 2022, the major focus was on the development of the software. Optimizations such as the logical validation, parallelization, data analysis and variability evaluation were implemented. At the end of the year, the results obtained using the variability feature were gathered and published at the 54th IEEE International Symposium on Circuits and Systems (ISCAS) (SANDOVAL; BRENDLER; KASTENSMIDT; ZIMPECK, et al., 2023).

The development of the tool continued, and in late 2023 the results obtained were presented at the 13th IEEE CASS Rio Grande do Sul Workshop (CASSW). The work received the workshop's award for best undergraduate work.

In early 2024, the focus shifted to polishing the tool. It came to a point where what once were a few Python scripts became sophisticated enough to receive a name: Quasar. A paper describing the general flow of the tool was submitted to and accepted at the *31st IEEE International Conference on Electronics Circuits and Systems* (ICECS) (SANDOVAL; BRENDLER; SCHVITTZ, et al., 2024).

Still in 2024, the tool started gaining reach. The methodology it automates is used by other members of the community who research similar topics, and thus it is useful to these researchers as well. As of now, Quasar is being used by two other researchers to aid the evaluation of radiation effects on circuits. One of these researchers has already had a paper that uses Quasar in its methodology accepted at the 16th IEEE Latin American Symposium on Circuits and Systems (LASCAS) (REIS et al., 2025).

Below, we present the complete list of publications from this project to date:

- Gate Mapping and Voltage Influence on Radiation Robustness: a C17 Benchmark Case-Study, 36° Simpósio Sul de Microeletrônica, 2021 (SANDOVAL; BRENDLER; KASTENSMIDT; REIS, et al., 2021).
- Exploring Gate Mapping and Transistor Sizing to Improve Radiation Robustness: A C17 Benchmark Case-study, IEEE 22nd Latin American Test Symposium (LATS), 2021 (SANDOVAL; BRENDLER; ZIMPECK, et al., 2021).
- Impact on Radiation Robustness of Gate Mapping in FinFET Circuits under Work-function Fluctuation, 54th IEEE International Symposium on Circuits and Systems (ISCAS), 2023 (SANDOVAL; BRENDLER; KASTENSMIDT; ZIMPECK, et al., 2023).
- Quasar - Boosting the Evaluation of the Variability Effects on Radiation Sensitivity, 31st IEEE International Conference on Electronics Circuits and Systems (ICECS), 2024 (SANDOVAL; BRENDLER; SCHVITTZ, et al., 2024).
- Evaluation of Transient Fault Tolerance in Different Logic Styles of 2:1 Multiplexers, 16th IEEE Latin American Symposium on Circuits and Systems (LASCAS), 2025 (REIS et al., 2025).

#### 6.2 SOURCE CODE

The source code of the tool is available at https://github.com/bnmfw/Quasar, with usage instructions included.

#### REFERENCES

AGUIAR, Y.Q. de et al. Permanent and single event transient faults reliability evaluation EDA tool. Microelectronics Reliability, v. 64, p. 63–67, 2016. Proceedings of the 27th European Symposium on Reliability of Electron Devices, Failure Physics and Analysis. ISSN 0026-2714. DOI: https://doi.org/10.1016/j.microrel.2016.07.072.

ANDJELKOVIC, M. et al. An overview of the modeling and simulation of the single event transients at the circuit level. In: 2017 IEEE 30th International Conference on Microelectronics (MIEL). [S.l.: s.n.], 2017. P. 35–44. DOI: 10.1109/MIEL.2017.8190065.

ANDJELKOVIC, Marko; KRSTIC, Milos. A Holistic Approach for Characterization of SET Effects in a Standard Digital Cell Library. In: 2024 IEEE 15th Latin America Symposium on Circuits and Systems (LASCAS). [S.l.: s.n.], 2024. P. 1–5. DOI: 10.1109/LASCAS60203.2024.10506165.

\_\_\_\_\_. Simulation-Based Analysis and Modeling of Generated Single Event Transient Pulse Width. In: 2024 IEEE 25th Latin American Test Symposium (LATS). [S.l.: s.n.], 2024. P. 1–6. DOI: 10.1109/LATS62223.2024.10534612.

ANDJELKOVIC, Marko et al. Characterization and Modeling of SET Generation Effects in CMOS Standard Logic Cells. In: 2019 IEEE 25th International Symposium on On-Line Testing and Robust System Design (IOLTS). [S.l.: s.n.], 2019. P. 212–215. DOI: 10.1109/IOLTS.2019.8854379.

ANDJELKOVIĆ, Marko. A methodology for characterization, modeling and mitigation of single event transient effects in CMOS standard combinational cells. Apr. 2022. S. 81. PhD thesis. DOI: 10.25932/publishup-53484.

AVIZIENIS, Algirdas. The four-universe information system model for the study of fault tolerance. **12th International Symposium on Fault-Tolerant Computing**, 1982.

AZIMI, Sarah et al. SETA: A CAD Tool for Single Event Transient Analysis and Mitigation on Flash-Based FPGAs. In: INTERNATIONAL Conference on Synthesis, Modeling, Analysis and Simulation Methods and Applications to Circuit Design (SMACD). [S.l.: s.n.], 2018. P. 1–52. DOI: 10.1109/SMACD.2018.8434897.

BAUMANN, R. Soft errors in advanced computer systems. **IEEE Design & Test of Computers**, v. 22, n. 3, p. 258–266, 2005. DOI: 10.1109/MDT.2005.69.

C17. C17-Benchmark: c17 benchmark Verilog Code with Logic Encryption, RTL Synthesis, RTL to GDSII Analysis, TCL Scripting — github.com. [S.l.: s.n.], 1985. https://github.com/swapnilanand123/c17-Benchmark. [Accessed 01-07-2024].

CARRENO, Victor A; CHOI, G; IYER, RK. Analog-digital simulation of transient-induced logic errors and upset susceptibility of an advanced control system. **NASA Technical Memorandum 4241**, 1990.

CLARK, Lawrence T. et al. ASAP7: A 7-nm finFET predictive process design kit. Microelectronics Journal, v. 53, p. 105–115, 2016. ISSN 0026-2692. DOI: https://doi.org/10.1016/j.mejo.2016.04.006. Available at: https://www.sciencedirect.com/science/article/pii/S002626921630026X.

DODD, P.E.; MASSENGILL, L.W. Basic mechanisms and modeling of single-event upset in digital microelectronics. **IEEE Transactions on Nuclear Science**, v. 50, n. 3, p. 583–602, 2003. DOI: 10.1109/TNS.2003.813129.

DUAN, Xueyan; WANG, Liyun; LAI, Jinmei. Effect of charge sharing on the single event transient response of CMOS logic gates. **Journal of Semiconductors**, IOP Publishing, v. 32, n. 9, p. 095008, 2011.

FERLET-CAVROIS, Véronique; MASSENGILL, Lloyd W.; GOUKER, Pascale. Single Event Transients in Digital CMOS—A Review. **IEEE Transactions on Nuclear Science**, v. 60, n. 3, p. 1767–1790, 2013. DOI: 10.1109/TNS.2013.2255624.

HAMAD, Ghaith Bany; HASAN, Syed Rafay, et al. Investigating the impact of propagation paths and re-convergent paths on the propagation induced pulse broadening. In: 2013 14th European Conference on Radiation and Its Effects on Components and Systems (RADECS). [S.l.: s.n.], 2013. P. 1–4. DOI: 10.1109/RADECS.2013.6937387.

\_\_\_\_\_. Modeling, analyzing, and abstracting single event transient propagation at gate level. In: 2014 IEEE 57th International Midwest Symposium on Circuits and Systems (MWSCAS). [S.l.: s.n.], 2014. P. 515–518. DOI: 10.1109/MWSCAS.2014.6908465.

HAMAD, Ghaith Bany; MOHAMED, Otmane Ait; SAVARIA, Yvon. Efficient multilevel formal analysis and estimation of design vulnerability to Single Event Transients. In: 2015 IEEE 21st International On-Line Testing Symposium (IOLTS). [S.l.: s.n.], 2015. P. 1–6. DOI: 10.1109/IOLTS.2015.7229818.

\_\_\_\_\_. Towards formal abstraction, modeling, and analysis of Single Event Transients at RTL. In: 2016 IEEE International Symposium on Circuits and Systems (ISCAS). [S.l.: s.n.], 2016. P. 2166–2169. DOI: 10.1109/ISCAS.2016.7539010.

KAHNG, Andrew B. Scaling: More than Moore's law. **IEEE Design Test of Computers**, v. 27, n. 3, p. 86–87, 2010. DOI: 10.1109/MDT.2010.71.

LI, Ji; DRAPER, Jeffrey. Accelerating soft-error-rate (SER) estimation in the presence of single event transients. In: 2016 53rd ACM/EDAC/IEEE Design Automation Conference (DAC). [S.l.: s.n.], 2016. P. 1–6. DOI: 10.1145/2897937.2897976.

MEINHARDT, C.; ZIMPECK, A.L.; REIS, R.A.L. Predictive evaluation of electrical characteristics of sub-22nm FinFET technologies under device geometry variations. **Microelectronics Reliability**, v. 54, n. 9, p. 2319–2324, 2014. SI: ESREF 2014. ISSN 0026-2714. DOI: https://doi.org/10.1016/j.microrel.2014.07.023. Available at: https://www.sciencedirect.com/science/article/pii/S0026271414002194.

MESSENGER, G. C. Collection of Charge on Junction Nodes from Ion Tracks. **IEEE Transactions on Nuclear Science**, v. 29, n. 6, p. 2024–2031, 1982. DOI: 10.1109/TNS.1982.4336490.

MUSTAFA, M; BHAT, Tawseef A; BEIGH, M R. Threshold voltage sensitivity to metal gate work-function based performance evaluation of double-gate n-FinFET structures for LSTP technology. World J. Nano Sci. Eng., Scientific Research Publishing, Inc, v. 03, n. 01, p. 17–22, 2013.

ORSHANSKY, Michael; NASSIF, Sani; BONING, Duane S. **Design for Manufacturability and Statistical Design**. 2008. ed. New York, NY: Springer, Dec. 2007. (Integrated Circuits and Systems).

PENG, Chao et al. Investigating Neutron-Induced Single Event Transient Characteristics by TCAD Simulations in 65 nm Technology and Below. In: 2019 3rd International Conference on Radiation Effects of Electronic Devices (ICREED). [S.l.: s.n.], 2019. P. 1–4. DOI: 10.1109/ICREED49760.2019.9205173.

RABAEY, Jan M. Digital Integrated Circuits: A design perspective. [S.l.]: Prentice-Hall, 2003.

RAJARAMAN, R. et al. SEAT-LA: a soft error analysis tool for combinational logic. In: 19TH International Conference on VLSI Design held jointly with 5th International Conference on Embedded Systems Design (VLSID'06). [S.l.: s.n.], 2006. 4 pp.-. DOI: 10.1109/VLSID.2006.143.

RAMAKRISHNAN, K. et al. Variation Impact on SER of Combinational Circuits. In: 8TH International Symposium on Quality Electronic Design (ISQED'07). [S.l.: s.n.], 2007. P. 911–916. DOI: 10.1109/ISQED.2007.168.

REIS, Ana Flávia et al. Evaluation of Transient Fault Tolerance in Different Logic Styles of 2:1 Multiplexers. In: 2025 IEEE Latin American Symposium on Circuits and Systems (LASCAS). [S.l.: s.n.], 2025. P. 1–5.

SAHA, Samar K. Modeling Process Variability in Scaled CMOS Technology. **IEEE Design & Test of Computers**, v. 27, n. 2, p. 8–16, 2010. DOI: 10.1109/MDT.2010.50.

SANDOVAL, Bernardo Borges; BRENDLER, Leonardo H.; KASTENSMIDT, Fernanda L.; REIS, Ricardo, et al. Gate Mapping and Voltage Influence on Radiation Robustness: a C17 Benchmark Case-Study. In: 36° Simpósio Sul de Microeletrônica. [S.l.: s.n.], 2021. P. 1–4.

SANDOVAL, Bernardo Borges; BRENDLER, Leonardo H.; KASTENSMIDT, Fernanda L.; ZIMPECK, Alexandra L., et al. Impact on Radiation Robustness of Gate Mapping in FinFET Circuits under Work-function Fluctuation. In: 2023 IEEE International Symposium on Circuits and Systems (ISCAS). [S.l.: s.n.], 2023. P. 1–5. DOI: 10.1109/ISCAS46773.2023.10181528.

SANDOVAL, Bernardo Borges; BRENDLER, Leonardo H.; SCHVITTZ, Rafael B., et al. Quasar - Boosting the Evaluation of the Variability Effects on Radiation Sensitivity. In: 2024 IEEE International Conference on Electronics Circuits and Systems (ICECS). [S.l.: s.n.], 2024. P. 1–5.

SANDOVAL, Bernardo Borges; BRENDLER, Leonardo Heitich; ZIMPECK, Alexandra L., et al. Exploring Gate Mapping and Transistor Sizing to Improve Radiation Robustness: A C17 Benchmark Case-study. In: 2021 IEEE 22nd Latin American Test Symposium (LATS). [S.l.: s.n.], 2021. P. 1–6. DOI: 10.1109/LATS53581.2021.9651798.

SHIVAKUMAR, P. et al. Modeling the effect of technology trends on the soft error rate of combinational logic. In: PROCEEDINGS International Conference on Dependable Systems and Networks. [S.l.: s.n.], 2002. P. 389–398. DOI: 10.1109/DSN.2002.1028924.

V, Vibhu; MITTAL, Sparsh; KUMAR, Vivek. Machine Learning-based model for Single Event Upset Current Prediction in 14nm FinFETs. In: 2023 36th International Conference on VLSI Design and 2023 22nd International Conference on Embedded Systems (VLSID). [S.l.: s.n.], 2023. P. 1–6. DOI: 10.1109/VLSID57277.2023.00048.

WANG, Xingsheng et al. Statistical variability and reliability in nanoscale FinFETs. In: 2011 International Electron Devices Meeting. [S.l.: s.n.], 2011. P. 5.4.1–5.4.4. DOI: 10.1109/IEDM.2011.6131494.

WESTE, Neil H. E.; HARRIS, David. CMOS VLSI Design. [S.l.]: Pearson India, 2015.

YU, Bin et al. FinFET scaling to 10 nm gate length. In: DIGEST. International Electron Devices Meeting, [s.l.: s.n.], 2002. P. 251–254. DOI: 10.1109/IEDM.2002.1175825.

ZHAO, Wei; CAO, Yu. New generation of predictive technology model for sub-45nm design exploration. In: 7TH International Symposium on Quality Electronic Design (ISQED'06). [S.l.: s.n.], 2006. 6 pp.–590. DOI: 10.1109/ISQED.2006.91.

# Quasar - Boosting the Evaluation of the Variability Effects on Radiation Sensitivity

Bernardo Borges Sandoval<sup>1</sup>, Leonardo H. Brendler, Rafael B. Schvittz<sup>2</sup> and Cristina Meinhardt<sup>1</sup>

<sup>1</sup>Departamento de Informática e Estatística - Universidade Federal de Santa Catarina (UFSC)

<sup>2</sup> Centro de Ciências Computacionais - Universidade Federal de Rio Grande (FURG)

bernardoborgessandoval@gmail.com, lhbrendler@inf.ufrgs.br, rafaelschivittz@furg.br, cristina.meinhardt@ufsc.br

Abstract—This work presents a tool developed to boost the detailed, electrical-level evaluation of variability effects on radiation sensitivity. The tool can handle circuits ranging from small basic cells to medium-sized multi-gate circuits in a few seconds, speeding up the traditional fault injection mechanism, which is based on a large number of electrical simulations. The tool exploits logical masking to reduce the design space exploration and parallelism to speed up the circuit-level evaluation. To show the applicability of the tool, this work presents results for circuits of a 2-input XOR function in complementary and transmission gate topologies.

Index Terms—Variability Effects, Single Event Transient, EDA tool

#### I. INTRODUCTION

Radiation effects in electronic circuits become a greater concern as transistor devices decrease in scale. Design requirements about these effects that were only considered for aerospace applications become more pertinent at ground level driven by the reduction of the voltage supply, the increase in the clock frequency, and the effects of process variability [1].

A Single Event Transient (SET) is a soft error that is observed when a charged particle strikes a sensitive part of a transistor device and, through a charge collection mechanism, generates a current pulse that might propagate throughout the circuit, reaching an output interface and causing an error [2]. The SET intensity is measured in Linear Energy Transfer (LET), which represents the energy deposited by the energetic particle in the material.

At this moment, few tools are available to help designers evaluate SET robustness and the impact of mitigation approaches. Examples include: 1) a fault injection tool that enables the introduction of SETs into a circuit [3], but lacks the capability to automatically assess the critical Linear Energy Transfer (LET) threshold required to trigger the fault; and 2) a gate-level SET characterization framework for standard cells [4], which has a narrower scope than tools encompassing the entire circuit. None of these tools considers process variability effects in the SET evaluation.

This work presents the Quasar tool, which provides a full circuit characterization with regard to radiation sensitivity while considering process variability effects. The set of configurations in which a SET can propagate an error to an output interface, as well as the minimal LET of these SETs, is automatically determined. The tool can also correlate how this characterization varies according to other factors, reporting the influences of parametric and geometric variability. This characterization can help circuit designers increase circuit reliability and make informed decisions about which parts of the circuit to target when mitigating radiation effects.

#### II. QUASAR TOOL DESCRIPTION

Quasar is implemented in the Python programming language and integrates with the Ngspice and Hspice electrical simulators. Additionally, the first version of Quasar is provided by the authors in a repository available upon request <sup>1</sup>.

The proposed tool can be described in four layers of abstraction built on top of each other, as illustrated in Figure 1, in which each layer provides a function call for the layer above. Quasar receives as input a netlist of a circuit and provides as the output a list of every possible SET configuration with the minimal LET value needed to propagate a fault to an output interface. The development explanation starts from the lowest level of abstraction to the highest.



Fig. 1. Quasar Abstraction Layers.

#### A. Fault Modeling

In Quasar, SET faults are modeled as a double exponential current pulse originating in the node struck by the charged particle, and the LET is calculated through Messenger's Equation [5]. The collected charge timing constant $\tau_{\alpha}$, the timing constant to establish the ion track $\tau_{\beta}$, and the charge constant that the ion particle deposits along its path are all configurable and default to 164 ps, 50 ps, and 10.8 fC/$\mu$m, respectively. The charge collection depth *L* is also configurable but technology-dependent: 21 nm is used for the 7 nm ASAP7 PDK [6] and 1000 nm for the 32 nm PTM bulk CMOS [7].
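As an illustration of the model above, the following minimal sketch uses the form of the double exponential pulse commonly associated with Messenger's model, $I(t) = \frac{Q}{\tau_{\alpha} - \tau_{\beta}}\left(e^{-t/\tau_{\alpha}} - e^{-t/\tau_{\beta}}\right)$ with $Q = 10.8 \cdot L \cdot \text{LET}$. The exact expressions used inside Quasar are not shown here, so these formulas and all function names are assumptions made for illustration.

```python
import math

# Default values quoted in the text; the function names are hypothetical.
TAU_ALPHA_S = 164e-12   # collected charge time constant [s]
TAU_BETA_S = 50e-12     # ion track establishment time constant [s]
CHARGE_FC_PER_UM = 10.8  # deposited charge [fC/um] per unit of LET


def collected_charge_fc(let, depth_um):
    """Collected charge in fC for a given LET and collection depth L (in um)."""
    return CHARGE_FC_PER_UM * depth_um * let


def current_amplitude_a(let, depth_um, tau_a=TAU_ALPHA_S, tau_b=TAU_BETA_S):
    """Amplitude Q / (tau_a - tau_b) of the double exponential, in amperes."""
    return collected_charge_fc(let, depth_um) * 1e-15 / (tau_a - tau_b)


def set_current_a(t, let, depth_um, tau_a=TAU_ALPHA_S, tau_b=TAU_BETA_S):
    """Double exponential SET current at time t (seconds after the strike)."""
    if t < 0:
        return 0.0
    amplitude = current_amplitude_a(let, depth_um, tau_a, tau_b)
    return amplitude * (math.exp(-t / tau_a) - math.exp(-t / tau_b))


def let_from_amplitude(amplitude_a, depth_um, tau_a=TAU_ALPHA_S, tau_b=TAU_BETA_S):
    """Invert the model: current amplitude in amperes back to LET."""
    charge_fc = amplitude_a * (tau_a - tau_b) * 1e15
    return charge_fc / (CHARGE_FC_PER_UM * depth_um)


if __name__ == "__main__":
    depth_um = 0.021  # 21 nm collection depth quoted for the 7 nm technology
    amplitude = current_amplitude_a(40.0, depth_um)
    print(f"amplitude = {amplitude:.3e} A")
    print(f"LET recovered = {let_from_amplitude(amplitude, depth_um):.2f}")
```

The inverse function reflects the idea that a search can operate on current values and convert the result to charge and LET only at the end.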

<sup>1</sup>Available at: https://github.com/bnmfw/Quasar

When analyzing the effect of a SET on circuit behavior, the criterion to determine whether it causes an error is how it impacts the circuit's output. The abstraction created by this layer is therefore a function that receives the faulted node, an output interface, the current pulse, and the circuit input vector, and returns the minimum voltage reached at the output interface if it was in the high logic state, or the maximum voltage if it was in the low logic state. The fault model is implemented as an exponential current source inserted between the faulted node and the ground node with adequate time and current parameters. Figure 2 shows the operation of this layer. Internally, Quasar works with the value of the pulse's current, only converting it to the critical charge and LET results at the end of the process. Upon calling this layer of abstraction, the logic state of the faulted node is informed; if the faulted node is in the high logic state, the current source drains the node charge to ground (gnd), representing a 1-0-1 fault. Otherwise, the current source injects current into the node, representing a 0-1-0 fault. The SPICE electrical simulator is then called, and a single electrical simulation is run.
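The current source described above could be written into a netlist roughly as follows. This is a hedged sketch, not Quasar's actual code: the function name, the source name `Iset`, the strike time, and the delay before the falling edge are placeholder assumptions. It relies only on the standard SPICE `EXP(I1 I2 TD1 TAU1 TD2 TAU2)` waveform and on the SPICE convention that a positive current flows from the first listed node, through the source, into the second.

```python
def set_source_card(node, node_is_high, amplitude_a, strike_time_s=1e-9,
                    fall_delay_s=10e-12, tau_rise_s=50e-12, tau_fall_s=164e-12,
                    name="Iset", ground="gnd"):
    """Return a SPICE current source line modeling a SET on `node`.

    strike_time_s and fall_delay_s are illustrative placeholders, not values
    taken from the text; the time constants are the defaults quoted there.
    """
    if node_is_high:
        # Pull current out of the node towards ground: 1-0-1 fault.
        first, second = node, ground
    else:
        # Push current from ground into the node: 0-1-0 fault.
        first, second = ground, node
    fall_time_s = strike_time_s + fall_delay_s
    values = (0.0, amplitude_a, strike_time_s, tau_rise_s, fall_time_s, tau_fall_s)
    waveform = " ".join(f"{value:.6g}" for value in values)
    return f"{name} {first} {second} EXP({waveform})"


if __name__ == "__main__":
    print(set_source_card("n1", node_is_high=True, amplitude_a=1e-4))
    print(set_source_card("n1", node_is_high=False, amplitude_a=1e-4))
```

Only the order of the two node names changes between the two fault polarities, which keeps the amplitude a positive number throughout the search.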



Fig. 2. Fault Model Layer.

#### B. Critical LET Search

For a SET fault to become an error, it needs to propagate to an output interface and cause a bit flip in it, which is considered to have happened when the voltage of the output interface reaches half the supply voltage value. A fault configuration is defined as a combination of a faulted node, output interface, and circuit state regarding its inputs.

Some configurations are not valid due to logical masking. For example, in an AND2 gate, if one of the inputs is in the low logic state, a fault cannot propagate from the other input to the output. When this layer is called, the fault configuration is flagged as safe if it is known to be valid, or as unsafe if its validity is unknown. In the unsafe case, two extra simulations have to be done: 1) one to get the logic state of both the faulted node and the output, information that is known beforehand in the safe case, and 2) one to verify logical masking. The latter is done by inserting an extremely high current pulse in the faulted node; if no change is detected at the output interface, the configuration is invalid and the Critical LET Search stops, returning a null LET. With the validity of the configuration confirmed, a binary search is done by calling the Fault Modeling layer in order to find the minimal current value that causes a bit flip at the output. The precision of this binary search is configurable and defaults to 10 nA. Figure 3 shows a simplified overview of the Critical LET Search process. The boxes in dark gray represent calls to the previous layer. The layer is abstracted as a function that receives a fault configuration and returns the minimal LET needed to propagate the fault.
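The search described in this subsection can be sketched as a plain bisection over the pulse current. In the sketch below, `causes_flip` is a hypothetical callback standing in for one call to the Fault Modeling layer followed by the half-supply comparison, and the upper bound of the search is an assumed parameter; neither name comes from Quasar.

```python
def output_flipped(extreme_voltage_v, vdd_v, output_is_high):
    """Bit flip criterion: the output crossed half of the supply voltage."""
    half = vdd_v / 2
    if output_is_high:
        return extreme_voltage_v <= half   # minimum voltage of a high output
    return extreme_voltage_v >= half       # maximum voltage of a low output


def critical_current(causes_flip, upper_a, precision_a=10e-9):
    """Smallest current, within precision_a, for which causes_flip is True.

    Returns None when even the largest current does not flip the output,
    which corresponds to a logically masked (invalid) configuration.
    """
    if not causes_flip(upper_a):
        return None
    low, high = 0.0, upper_a        # invariant: low never flips, high flips
    while high - low > precision_a:
        middle = (low + high) / 2
        if causes_flip(middle):
            high = middle
        else:
            low = middle
    return high


if __name__ == "__main__":
    # Toy stand-in for the simulator: the output flips above 123.456 uA.
    toy = lambda current_a: current_a >= 123.456e-6
    print(critical_current(toy, upper_a=1e-3))
    print(critical_current(lambda current_a: False, upper_a=1e-3))
```

Each loop iteration costs one electrical simulation, so the number of simulations grows with the logarithm of the ratio between the search range and the precision.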



Fig. 3. Critical LET Search Layer.

#### C. Circuit Level Evaluation

The LET<sub>th</sub> of a circuit is defined as the minimal LET necessary to cause a bit flip in an output interface in the most sensitive fault configuration. To find the LET<sub>th</sub>, every valid fault configuration has to be considered. It is possible to simply generate every combination of the factors that compose a fault configuration and let the Critical LET Search determine its validity electrically. However, this is not ideal, as the main performance bottleneck is the number of electrical simulations, and electrical validation adds two extra simulations for every possible fault configuration. As an illustration, a test was run using the default parameters of the search for the minimum LET and a Mirror Full Adder as a benchmark. It was observed that, with this method, 41% of all simulations were done only to validate fault configurations.

To avoid unneeded electrical validation, a logical validation of all fault configurations is done before any electrical simulation. For this purpose, the circuit is modeled as a graph-like structure. Every electrical node is modeled as a Boolean node, and for every transistor an undirected edge is created between its source and drain terminals. This edge has a Boolean state that indicates whether it is conducting, depending on the state of a control node. Figure 4 shows the circuit and the graph model for a NAND2 gate. The circle in the middle of each edge indicates which node's state controls conduction. A white circle means the edge conducts when the control node is in the low logic state, and a black circle means it conducts when the control node is in the high logic state. All fault configurations are modeled in the graph. The logic state of the faulted node is flipped to represent the fault; if the flip propagates and the desired output has its logical value flipped, the configuration is valid.
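A minimal sketch of such a graph model for the NAND2 gate is given below. The data layout, the function names, the internal node name `n1`, and in particular the rule that the flipped node's value takes priority over the supply rails when both reach the same node are assumptions of this sketch, not a description of Quasar's implementation. It only covers the propagation check and applies no filtering by device type.

```python
from collections import deque

# Each transistor is an edge: (terminal_1, terminal_2, control, conducts_when_high).
NAND2 = [
    ("vdd", "out", "a", False),  # PMOS controlled by a
    ("vdd", "out", "b", False),  # PMOS controlled by b
    ("out", "n1", "a", True),    # NMOS controlled by a
    ("n1", "gnd", "b", True),    # NMOS controlled by b
]


def flood(edges, states, sources, reference):
    """Copy each source value to unset nodes reachable via conducting edges."""
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for t1, t2, control, when_high in edges:
            if node not in (t1, t2) or reference.get(control) != when_high:
                continue
            other = t2 if node == t1 else t1
            if other not in states:
                states[other] = states[node]
                queue.append(other)


def solve(edges, inputs, fault=None):
    """Logic state of every driven node; fault is an optional (node, value)."""
    fixed = {"vdd": True, "gnd": False, **inputs}
    reference = dict(fixed)
    states = dict(fixed)
    for _ in range(len(edges) + 2):   # a few passes suffice for small circuits
        states = dict(fixed)
        if fault is not None:
            node, value = fault
            states[node] = value      # assumption: the fault wins over the rails
            flood(edges, states, [node], {**reference, node: value})
        flood(edges, states, list(fixed), {**reference, **states})
        if states == reference:
            break
        reference = states
    return states


def is_valid(edges, inputs, fault_node, output):
    """True if flipping fault_node also flips output for this input vector."""
    nominal = solve(edges, inputs)
    if fault_node not in nominal or output not in nominal:
        return False                  # floating node: treated as not valid here
    faulty = solve(edges, inputs, (fault_node, not nominal[fault_node]))
    return output in faulty and faulty[output] != nominal[output]


if __name__ == "__main__":
    for a in (False, True):
        for b in (False, True):
            print(a, b, is_valid(NAND2, {"a": a, "b": b}, "n1", "out"))
```

In this toy case a flip on the internal node only reaches the output when input `a` is high, since the NMOS between `out` and `n1` must conduct.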

Fig. 4. NAND2 gate modeling.

Furthermore, if a node is not connected to a terminal of a PMOS device, it cannot suffer a 0-1-0 fault, and the same is true for NMOS devices and 1-0-1 faults [8]. Configurations that involve these scenarios are discarded before the logic simulation step. Once all configurations are validated, they are cached, so if the process is repeated there is no need to validate them again. Figure 5 shows an overview of this layer's operation. This layer is abstracted as a function that receives a circuit and outputs a list of all fault configurations with their respective minimal LETs, the circuit radiation profile.



Fig. 5. Circuit Level Evaluation Layer.

All valid fault configurations have their minimal LET determined, thus determining the LET<sub>th</sub> of the circuit. Because the Critical LET Searches of different configurations are independent of each other, they can be parallelized. For this purpose, a parallelization framework was developed. It takes a function and a list of function inputs, in this case the Critical LET Search function and all valid fault configurations. A number of processes is created depending on the specification of the machine running the tool. A pool of inputs and a pool of outputs are created, from which the processes request inputs and to which they post results. The framework can be configured to regularly log the results, which is useful for circuits with hundreds of transistors or for the Variability Evaluation, the last layer of abstraction, in which the full evaluation can take more time.
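The input pool and output pool pattern described above can be sketched with standard library queues. For brevity and portability this sketch uses threads rather than the processes mentioned in the text, which is a deliberate substitution; `run_pool` and its parameters are hypothetical names, not Quasar's framework.

```python
import os
import queue
import threading


def run_pool(function, inputs, workers=None, log_every=None, log=print):
    """Apply function to every input with a pool of workers, keeping order."""
    inputs = list(inputs)
    workers = workers or os.cpu_count() or 1
    todo, done = queue.Queue(), queue.Queue()
    for index, item in enumerate(inputs):
        todo.put((index, item))

    def worker():
        # Each worker requests inputs until the input pool is empty.
        while True:
            try:
                index, item = todo.get_nowait()
            except queue.Empty:
                return
            try:
                result = function(item)
            except Exception as error:   # post the error instead of hanging
                result = error
            done.put((index, result))

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()

    results = [None] * len(inputs)
    for count in range(1, len(inputs) + 1):
        index, value = done.get()
        results[index] = value
        if log_every and count % log_every == 0:
            log(f"{count}/{len(inputs)} inputs done")
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    print(run_pool(lambda x: x * x, range(10), workers=4, log_every=5))
```

The periodic log call is where partial results could be written to disk so that a long evaluation can be resumed.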

Table I shows the execution time of the Circuit Level Evaluation of circuit C17 from the ISCAS 85 benchmarks. The C17 is a circuit with 24 transistors and 182 valid fault configurations. All tests were run using HSPICE as the electrical simulator on an AMD Ryzen 5 3600 6-Core Processor, a machine with 12 logical CPUs and a maximum clock of 3.6 GHz. The parallel execution with logical validation presents a speedup of 12.8 compared to the serial execution with electrical validation (616 s to 48 s). Moreover, according to Table I, logical validation alone gives a speedup of about 2.1 in the serial case (616 s to 288 s), while parallelization gives a speedup of 6 (288 s to 48 s), demonstrating the performance gain of both optimizations.

TABLE I CIRCUIT LEVEL EVALUATION EXECUTION TIME

|                       | Parallel | Serial |
|-----------------------|----------|--------|
| Logical Validation    | 48 s     | 288 s  |
| Electrical Validation | 104 s    | 616 s  |
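The speedups implied by the execution times in Table I can be recomputed directly from the four measured values. The snippet below is only a worked calculation; the dictionary layout and function name are illustrative.

```python
# Execution times from Table I, in seconds.
TIMES_S = {
    ("logical", "parallel"): 48,
    ("logical", "serial"): 288,
    ("electrical", "parallel"): 104,
    ("electrical", "serial"): 616,
}


def speedup(baseline, improved):
    """Ratio between the baseline time and the improved time."""
    return TIMES_S[baseline] / TIMES_S[improved]


# Both optimizations together: 616 s -> 48 s, about 12.8.
both = speedup(("electrical", "serial"), ("logical", "parallel"))
# Logical validation alone, serial execution: 616 s -> 288 s, about 2.1.
validation_only = speedup(("electrical", "serial"), ("logical", "serial"))
# Parallelization alone, with logical validation: 288 s -> 48 s, exactly 6.
parallel_only = speedup(("logical", "serial"), ("logical", "parallel"))

if __name__ == "__main__":
    print(f"{both:.1f} {validation_only:.1f} {parallel_only:.1f}")
```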

#### D. Variability Evaluation

The operation of the Variability Evaluation is presented in Figure 6. To evaluate how variability impacts the circuit robustness, the variability parameters are defined first. A Gaussian distribution is generated given an attribute of the device model, a standard deviation, and the number of Monte Carlo (MC) runs. For each MC run, a full Circuit Level Evaluation is done, resulting in the full radiation reliability profile of the circuit at each sampled point. Other statistics are also gathered, such as which node, output interface, or input vector is the most sensitive depending on variability.
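The Monte Carlo loop described above can be sketched as follows. The callback `evaluate_circuit` stands in for a full Circuit Level Evaluation and is hypothetical, as are all other names. The sketch also assumes that the specified percentage is the deviation at three standard deviations (so that sigma equals nominal times fraction divided by 3), which is one possible reading of the setup and should be treated as an assumption.

```python
import random
import statistics
from collections import Counter


def sample_parameter(nominal, three_sigma_fraction, runs, seed=None):
    """Gaussian samples of a model attribute (assumed: fraction is at 3 sigma)."""
    rng = random.Random(seed)
    sigma = nominal * three_sigma_fraction / 3
    return [rng.gauss(nominal, sigma) for _ in range(runs)]


def variability_evaluation(evaluate_circuit, nominal, three_sigma_fraction,
                           runs, seed=None):
    """One record (parameter value, critical configuration, LETth) per MC run.

    evaluate_circuit(value) must return the radiation profile as a list of
    (configuration, minimal_let) pairs containing valid configurations only.
    """
    records = []
    for value in sample_parameter(nominal, three_sigma_fraction, runs, seed):
        profile = evaluate_circuit(value)
        configuration, let_th = min(profile, key=lambda pair: pair[1])
        records.append((value, configuration, let_th))
    return records


def summarize(records):
    """Mean, standard deviation, minimum and maximum of the sampled LETth."""
    lets = [let_th for _, _, let_th in records]
    return {"mean": statistics.mean(lets), "stdev": statistics.stdev(lets),
            "min": min(lets), "max": max(lets)}


def critical_share(records):
    """Fraction of runs in which each configuration was the critical one."""
    counts = Counter(configuration for _, configuration, _ in records)
    return {name: count / len(records) for name, count in counts.items()}


if __name__ == "__main__":
    # Toy profile: one configuration depends on the parameter, one does not.
    toy = lambda value: [("cfg_a", 40.0 * value), ("cfg_b", 41.0)]
    runs = variability_evaluation(toy, 1.0, 0.05, 200, seed=1)
    print(summarize(runs))
    print(critical_share(runs))
```

Keeping the critical configuration in each record is what allows statistics such as the most frequent critical input vector to be derived afterwards.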



Fig. 6. Variability Evaluation Layer.

#### III. DISCUSSION OF THE TOOL VERSATILITY

Quasar's goal of providing useful data about SET robustness while also considering process variability has already led to useful results. These include evaluations of how different gate mappings of the same circuit can impact robustness [9] and of how the Work Function (WF) fluctuation impacts the radiation reliability, including how process variability can change the critical fault configuration [10]. To illustrate how Quasar's variability evaluation can bring further insight into the circuit response to radiation faults, an analysis of two topologies of the 2-input XOR gate, presented in Figure 7, is shown. This evaluation also shows that Quasar can be applied to complementary (CMOS) topology circuits as well as to transmission gate (TG) topology circuits.

Both architectures were implemented in 7 nm and 32 nm technology. A variability evaluation was done taking the WF as the variability parameter for 7 nm using a normal distribution of 5% and 3  $\sigma$ . For the 32 nm, Threshold Voltage (VTH0) was varied using a 10% and 3  $\sigma$  distribution. Table II shows the LET<sub>th</sub> of the circuits without considering process variability (Nominal) and the distribution of the results considering process variability by the mean ( $\mu$ ), standard deviation ( $\sigma$ ), minimum and maximum values.

Fig. 7. 2-input XOR Topologies.

Evidently, in 32 nm technology, the TG topology is less robust than the CMOS topology, as the minimal LET<sub>th</sub> of the CMOS version is significantly greater than the maximal LET<sub>th</sub> of the TG. However, in 7 nm technology, both gates have similar robustness at nominal conditions. Both the nominal and the average LET<sub>th</sub> of the TG topology are greater than those of the CMOS, meaning that, in most cases, the CMOS XOR topology evaluated is more sensitive. Similar to the results obtained in [10], this difference depends on device variability. In some cases, process variability inverts which XOR circuit is more robust. Figure 8 shows the result of subtracting the LET<sub>th</sub> of the 7 nm CMOS XOR topology from that of the 7 nm TG XOR, with each point identified by which topology is more robust. Although the difference is large at most points, in some cases, depending on the WF fluctuation, it becomes slightly negative, i.e., the CMOS XOR reaches higher robustness. This exemplifies how critical it is to have data about how radiation robustness differs depending on the nuances of the fabrication process.

TABLE II LET<sub>th</sub> DISTRIBUTION IN MeV·cm<sup>2</sup>/mg

| Topology | Technology | Nominal | $\mu$ | $\sigma$ | Min  | Max   |
|----------|------------|---------|-------|----------|------|-------|
| CMOS     | 7 nm       | 40.82   | 39.65 | 9.28     | 7.34 | 67.10 |
| TG       | 7 nm       | 44.33   | 44.99 | 11.00    | 7.68 | 79.83 |
| CMOS     | 32 nm      | 1.25    | 1.25  | 0.04     | 1.14 | 1.37  |
| TG       | 32 nm      | 0.73    | 0.72  | 0.03     | 0.63 | 0.80  |

Furthermore, not only does the value of the LET<sub>th</sub> change according to process variability, but the critical fault configuration might also change. Table III shows the proportion of each critical input vector for each circuit in 7 nm. In the CMOS XOR, the two vectors with equal inputs share the same sensitivity, as do the two vectors with different inputs. This means that the CMOS topology is more frequently in its most sensitive state compared to its counterpart, reinforcing the advantage of the TG topology over the CMOS.

#### IV. CONCLUSION

Advanced technology nodes have brought new challenges, such as process variability effects. In the evaluation of SET robustness, process variability can affect the behavior of the sensitive nodes and the critical charge, which in turn affects the mitigation approaches to be applied to the circuit under evaluation.



Fig. 8. Comparison between the TG LET<sub>th</sub> and CMOS LET<sub>th</sub>.

TABLE III CRITICAL INPUT PROPORTION CONSIDERING PROCESS VARIABILITY

| A | B | 7 nm CMOS XOR critical prevalence | 7 nm TG XOR critical prevalence |
|---|---|-----------------------------------|---------------------------------|
| 0 | 0 | 14.9%                             | 2.3%                            |
| 0 | 1 | 85.1%                             | 66.8%                           |
| 1 | 0 | 85.1%                             | 24.8%                           |
| 1 | 1 | 14.9%                             | 6.1%                            |

The Quasar tool provides an environment for designers to quickly evaluate the sensitivity of circuits and to compare different topologies and circuit-level mitigation approaches while considering process variability in the analysis. The tool was also developed to allow easy adaptation to different technology parameters and variability configurations.

#### REFERENCES

- [1] R. Baumann. Soft errors in advanced computer systems. *IEEE Design & Test of Computers*, 22(3):258–266, 2005.
- [2] R.C. Baumann. Radiation-induced soft errors in advanced semiconductor technologies. *IEEE Transactions on Device and Materials Reliability*, 5(3):305–316, 2005.
- [3] Y.Q. de Aguiar et al. Permanent and single event transient faults reliability evaluation EDA tool. *Microelectronics Reliability*, 64:63–67, 2016.
- [4] Marko Andjelkovic and Milos Krstic. A holistic approach for characterization of SET effects in a standard digital cell library. In *IEEE Latin America Symposium on Circuits and Systems (LASCAS)*, pages 1–5, 2024.
- [5] G. C. Messenger. Collection of charge on junction nodes from ion tracks. *IEEE Transactions on Nuclear Science*, 29(6):2024–2031, 1982.
- [6] Lawrence T. Clark et al. ASAP7: A 7-nm FinFET predictive process design kit. *Microelectronics Journal*, 53:105–115, 2016.
- [7] Wei Zhao and Yu Cao. New generation of predictive technology model for sub-45nm design exploration. In *7th International Symposium on Quality Electronic Design (ISQED'06)*, pages 6 pp.–590, 2006.
- [8] Véronique Ferlet-Cavrois, Lloyd W. Massengill, and Pascale Gouker. Single event transients in digital CMOS—a review. *IEEE Transactions on Nuclear Science*, 60(3):1767–1790, 2013.
- [9] Bernardo Borges Sandoval et al. Exploring gate mapping and transistor sizing to improve radiation robustness: A C17 benchmark case-study. In *IEEE Latin American Test Symposium (LATS)*, 2021.
- [10] Bernardo Borges Sandoval et al. Impact on radiation robustness of gate mapping in FinFET circuits under work-function fluctuation. In *IEEE Int. Symp. on Circuits and Systems (ISCAS)*, 2023.