sed and awk

Author: Dale Dougherty

ISBN-10: 1565922255

ISBN-13: 9781565922259

Category: Utilities (Computer Applications)

sed & awk describes two text processing programs that are mainstays of the UNIX programmer's toolbox.

sed is a "stream editor" for editing streams of text that might be too large to edit as a single file, or that might be generated on the fly as part of a larger data processing step. The most common operation done with sed is substitution, replacing one block of text with another.

awk is a complete programming language. Unlike many conventional languages, awk is "data driven" -- you...

The book begins with an overview and a tutorial that demonstrate a progression in functionality from grep to sed to awk. sed and awk share a similar command-line syntax, accepting user instructions in the form of a script. Because all three programs use UNIX regular expressions, an entire chapter is devoted to understanding UNIX regular expression syntax. Next, the book describes how to write sed scripts. After getting started by writing a few simple scripts, you'll learn other basic commands that parallel manual editing actions, as well as advanced commands that introduce simple programming constructs. Among the advanced commands are those that manipulate the hold space, a set-aside temporary buffer. The second part of the book has been extensively revised to include POSIX awk as well as coverage of three freely available and three commercial versions of awk. The book introduces the primary features of the awk language and how to write simple scripts. You'll also learn: common programming constructs; how to use awk's built-in functions; how to write user-defined functions; debugging techniques for awk programs; how to develop an application that processes an index, demonstrating much of the power of awk; and FTP and contact information for obtaining various versions of awk. Also included is a miscellany of user-contributed scripts that demonstrate a wide range of sed and awk scripting styles and techniques.

From Chapter 7: Writing Scripts for awk

As mentioned in the preface, this book describes POSIX awk; that is, the awk language as specified by the POSIX standard. Before diving into the details, we'll provide a bit of history.

The original awk was a nice little language. It first saw the light of day with Version 7 UNIX, around 1978. It caught on, and people used it for significant programming.

In 1985, the original authors, seeing that awk was being used for more serious programming than they had ever intended, decided to beef up the language. (See Chapter 11, A Flock of awks, for a description of the original awk, and all the things it did not have when compared to the new one.) The new version was finally released to the world at large in 1987, and it is this version that is still found on SunOS 4.1.x systems.

In 1989, for System V Release 4, awk was updated in some minor ways. This version became the basis for the awk feature list in the POSIX standard. POSIX clarified a number of things about awk, and added the CONVFMT variable (to be discussed later in this chapter).

As you read the rest of this book, bear in mind that the term awk refers to POSIX awk, and not to any particular implementation, whether the original one from Bell Labs, or any of the others discussed in Chapter 11. However, in the few cases where different versions have fundamental differences of behavior, that will be pointed out in the main body of the discussion.

Playing the Game

To write an awk script, you must become familiar with the rules of the game. The rules can be stated plainly and you will find them described in Appendix B, Quick Reference for awk, rather than in this chapter. The goal of this chapter is not to describe the rules but to show you how to play the game. In this way, you will become acquainted with many of the features of the language and see examples that illustrate how scripts actually work.

Some people prefer to begin by reading the rules, which is roughly equivalent to learning to use a program from its manual page or learning to speak a language by scanning its rules of grammar--not an easy task. Having a good grasp of the rules, however, is essential once you begin to use awk regularly. But the more you use awk, the faster the rules of the game become second nature. You learn them through trial and error--spending a long time trying to fix a silly syntax error such as a missing space or brace has a magical effect upon long-term memory. Thus, the best way to learn to write scripts is to begin writing them. As you make progress writing scripts, you will no doubt benefit from reading the rules (and rereading them) in Appendix B or the awk manpage or The AWK Programming Language book. You can do that later--let's get started now.

Hello, World

It has become a convention to introduce a programming language by demonstrating the "Hello, world" program. Showing how this program works in awk will demonstrate just how unconventional awk is. In fact, it's necessary to show several different approaches to printing "Hello, world."

In the first example, we create a file named test that contains a single line. This example shows a script that contains the print statement:

$ echo 'this line of data is ignored' > test
$ awk '{ print "Hello, world" }' test
Hello, world

This script has only a single action, which is enclosed in braces. That action is to execute the print statement for each line of input. In this case, the test file contains only a single line; thus, the action occurs once. Note that the input line is read but never output.

Now let's look at another example. Here, we use a file that contains the line "Hello, world."

$ cat test2
Hello, world
$ awk '{ print }' test2
Hello, world

In this example, "Hello, world" appears in the input file.
The same result is achieved because the print statement, without arguments, simply outputs each line of input. If there were additional lines of input, they would be output as well.

Both of these examples illustrate that awk is usually input-driven. That is, nothing happens unless there are lines of input on which to act. When you invoke the awk program, it reads the script that you supply, checking the syntax of your instructions. Then awk attempts to execute the instructions for each line of input. Thus, the print statement will not be executed unless there is input from the file.

To verify this for yourself, try entering the command line in the first example but omit the filename. You'll find that because awk expects input to come from the keyboard, it will wait until you give it input to process: press RETURN several times, then type an EOF (CTRL-D on most systems) to signal the end of input. For each time that you pressed RETURN, the action that prints "Hello, world" will be executed.

There is yet another way to write the "Hello, world" message and not have awk wait for input. This method associates the action with the BEGIN pattern. The BEGIN pattern specifies actions that are performed before the first line of input is read.

$ awk 'BEGIN { print "Hello, world" }'
Hello, world

Awk prints the message, and then exits. If a program has only a BEGIN pattern, and no other statements, awk will not process any input files.

Awk's Programming Model

It's important to understand the basic model that awk offers the programmer. Part of the reason why awk is easier to learn than many programming languages is that it offers such a well-defined and useful model to the programmer.

An awk program consists of what we will call a main input loop. A loop is a routine that is executed over and over again until some condition exists that terminates it.
You don't write this loop; it is given--it exists as the framework within which the code that you do write will be executed. The main input loop in awk is a routine that reads one line of input from a file and makes it available for processing. The actions you write to do the processing assume that there is a line of input available. In another programming language, you would have to create the main input loop as part of your program. It would have to open the input file and read one line at a time. This is not necessarily a lot of work, but it illustrates a basic awk shortcut and makes it easier for you to write your program.

The main input loop is executed as many times as there are lines of input. As you saw in the "Hello, world" examples, this loop does not execute until there is a line of input. It terminates when there is no more input to be read.

Awk allows you to write two special routines that can be executed before any input is read and after all input is read. These are the procedures associated with the BEGIN and END rules, respectively. In other words, you can do some preprocessing before the main input loop is ever executed and you can do some postprocessing after the main input loop has terminated. The BEGIN and END procedures are optional.

You can think of an awk script as having potentially three major parts: what happens before, what happens during, and what happens after processing the input. Figure 7-1 shows the relationship of these parts in the flow of control of an awk script...
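
The three-part structure described in the excerpt can be illustrated with a small script. The following is only a sketch and is not taken from the book: the temporary sample file, its three lines of data, and the count_lines helper are hypothetical names introduced here for illustration.

```shell
# Hypothetical sample input: three lines written to a temporary file.
sample=$(mktemp)
printf 'alpha\nbeta\ngamma\n' > "$sample"

# count_lines is a hypothetical helper wrapping a three-part awk script.
count_lines() {
    awk '
    BEGIN { print "start" }          # before: runs once, before any input is read
          { n++; print n ": " $0 }   # during: runs once for each line of input
    END   { print n " lines" }       # after: runs once, when input is exhausted
    ' "$1"
}

count_lines "$sample"
```

With the three-line sample file, the script prints "start", then each input line prefixed by its number, then "3 lines". Only the middle rule depends on the main input loop; the BEGIN and END rules each run exactly once.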

Preface
Chapter 1. Power Tools for Editing
Chapter 2. Understanding Basic Operations
Chapter 3. Understanding Regular Expression Syntax
Chapter 4. Writing sed Scripts
Chapter 5. Basic sed Commands
Chapter 6. Advanced sed Commands
Chapter 7. Writing Scripts for awk
Chapter 8. Conditionals, Loops, and Arrays
Chapter 9. Functions
Chapter 10. The Bottom Drawer
Chapter 11. A Flock of awks
Chapter 12. Full-Featured Applications
Chapter 13. A Miscellany of Scripts
Appendix A. Quick Reference for sed
Appendix B. Quick Reference for awk
Appendix C. Supplement for Chapter 12
Index

From Barnes & Noble

Fatbrain Review

Serious UNIX programmers and administrators will enjoy the second edition of this best-selling book on two of the most popular UNIX utilities, sed and awk. Why? Because it covers awk as described by the POSIX standard as well as the NetBSD, FreeBSD, and Linux versions of awk.

The journey begins with an overview of the basic operations of sed and awk, showing a progression in functionality from grep to sed to awk. The next stop is writing sed scripts. You'll learn the syntax of sed commands and advanced features, including multiple pattern space and hold space commands.

The book then moves to writing scripts for awk. Discussions include pattern matching, expressions, relational and Boolean operators, and information retrieval. The text also explains awk's built-in functions and user-defined functions. The authors keep you learning by outlining the development of an index processing application, and they offer readers contact information for obtaining various versions of awk. This tutorial includes a miscellany of sed and awk scripting styles and techniques.