{"id":25178,"date":"2017-09-05T23:23:24","date_gmt":"2017-09-05T17:53:24","guid":{"rendered":"https:\/\/www.wikitechy.com\/technology\/?p=25178"},"modified":"2017-09-05T23:23:24","modified_gmt":"2017-09-05T17:53:24","slug":"compiler-design-lexical-analysis","status":"publish","type":"post","link":"https:\/\/www.wikitechy.com\/technology\/compiler-design-lexical-analysis\/","title":{"rendered":"Compiler Design | Lexical Analysis"},"content":{"rendered":"<p>Lexical analysis is the first phase of a compiler; the component that performs it is also known as a scanner. It converts the input program into a sequence of tokens.<\/p>\n<p><a href=\"https:\/\/www.wikitechy.com\/technology\/time-complexity-where-loop-variable-is-incremented-by-1-2-3-4\/\">Lexical Analysis<\/a> can be implemented with a deterministic finite automaton (DFA).<\/p>\n<p><strong>What is a token?<\/strong><\/p>\n<p>A lexical token is a sequence of characters that can be treated as a unit in the grammar of a programming language.<\/p>\n<h4 id=\"example-of-tokens\"><strong>Example of tokens:<\/strong><\/h4>\n<ul>\n<li>Token types for names and literals (id, number, real, . . . )<\/li>\n<li>Keyword tokens, i.e. reserved words made of alphabetic characters (if, void, return, . . . )<\/li>\n<\/ul>\n<pre>Keywords: for, while, if, etc.\r\nIdentifiers: variable names, function names, etc.\r\nOperators: '+', '++', '-', etc.\r\nSeparators: ',', ';', etc.<\/pre>\n<h4 id=\"example-of-non-tokens\"><strong>Example of Non-Tokens:<\/strong><\/h4>\n<ul>\n<li>Comments, preprocessor directives, macros, blanks, tabs, newlines, etc.<\/li>\n<\/ul>\n<h4 id=\"how-lexical-analyzer-functions\"><strong>How the Lexical Analyzer functions<\/strong><\/h4>\n<ul>\n<li>Tokenization, i.e. dividing the program into valid tokens.<\/li>\n<li>Removes white space characters.<\/li>\n<li>Removes comments.<\/li>\n<li>Helps generate error messages by providing the row number and column number.<\/li>\n<\/ul>\n<p><img fetchpriority=\"high\" decoding=\"async\" class=\"aligncenter wp-image-25179 size-full\" src=\"https:\/\/www.wikitechy.com\/technology\/wp-content\/uploads\/2017\/05\/Compiler-Design.png\" alt=\"Compiler Design Lexical Analysis\" width=\"428\" height=\"261\" srcset=\"https:\/\/www.wikitechy.com\/technology\/wp-content\/uploads\/2017\/05\/Compiler-Design.png 428w, https:\/\/www.wikitechy.com\/technology\/wp-content\/uploads\/2017\/05\/Compiler-Design-300x183.png 300w\" sizes=\"(max-width: 428px) 100vw, 428px\" \/><\/p>\n<p>The lexical analyzer identifies errors with the help of the automaton and the grammar of the language it is built for, such as C or C++.<\/p>\n<p>Suppose we pass the following statement through the lexical analyzer:<\/p>\n<p><strong>a = b + c ;<\/strong><\/p>\n<p>It will generate a token sequence like this:<\/p>\n<p><strong>id = id + id ;<\/strong><\/p>\n<p>Here each id refers to its variable's entry in the symbol table, which holds all of its details.<\/p>\n<p><strong>For example,<\/strong> consider the program:<\/p>\n<div class=\"code-embed-wrapper\"> <div class=\"code-embed-infos\"> <span class=\"code-embed-name\">C and C++<\/span> <\/div> <pre class=\"language-c code-embed-pre line-numbers\"  data-start=\"1\" data-line-offset=\"0\"><code class=\"language-c code-embed-code\">int main()<br\/>{<br\/>  \/\/ 2 variables<br\/>  int a, b;<br\/>  a = 10;<br\/>  return 0;<br\/>}<\/code><\/pre> <\/div>\n<p>All the valid tokens are:<\/p>\n<div class=\"code-embed-wrapper\"> <div class=\"code-embed-infos\"> <span class=\"code-embed-name\">C and C++<\/span> <\/div> <pre class=\"language-c code-embed-pre line-numbers\"  data-start=\"1\" data-line-offset=\"0\"><code class=\"language-c code-embed-code\">&#039;int&#039;  &#039;main&#039;  &#039;(&#039;  &#039;)&#039;  &#039;{&#039;  &#039;int&#039;  &#039;a&#039;  &#039;,&#039;  &#039;b&#039;  &#039;;&#039;<br\/> &#039;a&#039;  &#039;=&#039;  &#039;10&#039;  &#039;;&#039; &#039;return&#039;  &#039;0&#039;  &#039;;&#039;  &#039;}&#039;<\/code><\/pre> <\/div>\n<p>The comment is a non-token, so the lexical analyzer discards it.<\/p>\n<p>As another example, consider the printf statement below.<\/p>\n<p><img decoding=\"async\" class=\"aligncenter wp-image-25180 size-full\" src=\"https:\/\/www.wikitechy.com\/technology\/wp-content\/uploads\/2017\/05\/Compiler-Design-Lexical-Analysis.png\" alt=\"Compiler Design Lexical Analysis\" width=\"300\" height=\"126\" \/><\/p>\n<p>There are 5 valid tokens in this printf statement.<\/p>\n<p><strong>Exercise 1:<\/strong><\/p>\n<p>Count the number of tokens:<\/p>\n<p><strong>Output:<\/strong><\/p>\n<pre><strong>Answer: Total number of tokens: 27.<\/strong><\/pre>\n<p><strong>Exercise 2:<\/strong><\/p>\n<p>Count the number of tokens:<\/p>\n<p>int max(int i);<\/p>\n<ul>\n<li>The lexical analyzer first reads <strong>int<\/strong>, finds it to be valid, and accepts it as a token.<\/li>\n<li><strong>max<\/strong> is read next and found to be a valid function name after reading<strong> 
(<\/strong><\/li>\n<li><strong>int<\/strong> is also a token, then <strong>i<\/strong> is another token, followed by <strong>)<\/strong> and finally <strong>;<\/strong><\/li>\n<\/ul>\n<pre>Answer: Total number of tokens is 7: int, max, (, int, i, ), ;<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Compiler Design &#8211; Lexical Analysis &#8211; Lexical Analysis is the first phase of compiler also known as scanner. It converts the input program into a sequence<\/p>\n","protected":false},"author":1,"featured_media":25182,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[71772],"tags":[71825,71805,71819,71807,71815,71791,71808,71773,71824,71812,71821,71787,71786,71818,71782,71789,71777,71799,71784,71783,71797,71774,71803,71781,71817,71780,71776,71828,71831,71778,71823,71820,71775,71801,71798,71793,71804,71814,71779,71809,71810,71796,71785,71795,71790,71813,71816,71800,71792,71806,71802,71794,71811,71830,71827,71788,71829,71826,71822],"class_list":["post-25178","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-compiler-design","tag-c-code-for-lexical-analysis","tag-c-code-for-lexical-analyzer","tag-c-lexical-analyzer","tag-c-program-for-lexical-analysis-in-compiler-design","tag-c-program-for-lexical-analyzer","tag-c-program-to-design-lexical-analyzer","tag-c-program-to-implement-lexical-analyser","tag-compiler-design-lexical-analysis","tag-compiler-lexical-analysis","tag-example-of-lexical-analysis","tag-generate-lexical-analyzer-using-lex","tag-implementation-of-lexical-analyser-using-lex-tool-program","tag-implementation-of-lexical-analyzer-in-compiler-design","tag-implementation-of-lexical-analyzer-using-c-program","tag-implementation-of-lexical-analyzer-using-lex-tool","tag-lex-code-for-lexical-analyzer","tag-lexical-analyser","tag-lexical-analyser-in-c","tag-lexical-analyser-in-compiler-design","tag-lexical-analyser-program","tag
-lexical-analyser-using-lex-tool","tag-lexical-analysis","tag-lexical-analysis-code","tag-lexical-analysis-example","tag-lexical-analysis-example-c","tag-lexical-analysis-in-compiler","tag-lexical-analysis-in-compiler-design","tag-lexical-analysis-in-compiler-design-c-program","tag-lexical-analysis-pdf","tag-lexical-analysis-program","tag-lexical-analysis-program-in-c","tag-lexical-analysis-tools","tag-lexical-analyzer","tag-lexical-analyzer-code-in-c","tag-lexical-analyzer-example","tag-lexical-analyzer-for-sample-language-using-lex","tag-lexical-analyzer-for-sample-language-using-lex-program","tag-lexical-analyzer-generator","tag-lexical-analyzer-in-compiler-design","tag-lexical-analyzer-java","tag-lexical-analyzer-program","tag-lexical-analyzer-program-in-c","tag-lexical-analyzer-program-in-compiler-design","tag-lexical-analyzer-program-in-lex","tag-lexical-analyzer-program-using-lex-tool","tag-lexical-analyzer-source-code","tag-lexical-analyzer-using-lex","tag-lexical-analyzer-using-lex-program","tag-lexical-analyzer-using-lex-tool","tag-lexical-parsing","tag-lexical-tokens","tag-program-for-lexical-analysis","tag-program-for-lexical-analyzer","tag-regular-expression-in-compiler-design","tag-role-of-lexical-analyzer","tag-simple-lex-program-for-lexical-analyzer","tag-token-in-compiler-design","tag-what-is-syntax-analysis","tag-write-a-program-to-implement-the-lexical-analyzer"],"_links":{"self":[{"href":"https:\/\/www.wikitechy.com\/technology\/wp-json\/wp\/v2\/posts\/25178","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.wikitechy.com\/technology\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.wikitechy.com\/technology\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.wikitechy.com\/technology\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.wikitechy.com\/technology\/wp-json\/wp\/v2\/comments?post=25178"}],"version-history":[{"count":0,"href":"https:\/\/www.wik
itechy.com\/technology\/wp-json\/wp\/v2\/posts\/25178\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.wikitechy.com\/technology\/wp-json\/wp\/v2\/media\/25182"}],"wp:attachment":[{"href":"https:\/\/www.wikitechy.com\/technology\/wp-json\/wp\/v2\/media?parent=25178"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.wikitechy.com\/technology\/wp-json\/wp\/v2\/categories?post=25178"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.wikitechy.com\/technology\/wp-json\/wp\/v2\/tags?post=25178"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}