
Re-post: How to join 2 awk commands into one, using the output of the first as the matching pattern of the second? - shell

  #1  
Old 08-11-2008, 09:25 AM
Database Bot
 
Join Date: Sep 2009
Posts: 1,236,254
Database Administrator is on a distinguished road
Re-post: How to join 2 awk commands into one, using the output of the first as the matching pattern of the second?

Hi, all:

I posted this question several days ago. Thank you all who have
answered the question. But it's still not working in my environment
after trying the answers. So I re-post it now, in case the original
topic might be too de-prioritized to be seen on the page.

I have 2 files now: names and records

Format of names looks like:

AA.12
AA.15a
AA.16a
AA.17
AA.19
AA.1l_AA.2l
AA.3
AA.5
SP.1
SP.16l
SP.17l

The format of records looks like the following, but it might have more
lines than the names file:

GRID_CHECK 102
AA.1l_AA.2l 63
AA.3 29999
AA.5 116
AA.12 10177
AA.15a 93
AA.16a 1
AA.17 868
AA.19 2
PWH.2 7
PWH.3 100
RLHVP.1 409
ONODMY.11 112032
GT.5l 8258
PLL.9 931
PLL.12 5779
SP.1 41
SP.2 2
SP.9 1246
SP.11 5493
SP.16l 4920
SP.17l 20244
SAB.1 27
SAB.3 8

I need to find the corresponding number of each name in the file
records. So I did the following:

# awk 'NR==1' names
AA.12
# awk '/AA.12/' records
AA.12 10177
# awk 'NR==2' names
AA.15a
# awk '/AA.15a/' records
AA.15a 93
# awk 'NR==3' names
AA.16a
# awk '/AA.16a/' records
AA.16a 1
... ...
until the last line of file names.

As shown above, for each line in names, I need two awk commands to get
the result. The matching pattern of the second command is the result
of the first command. So I can only do it manually one by one. If the
2 awk commands can be joined into one command without knowing the
output of the first command, then I can write it easily as a loop in a
shell program.

Is it possible to join the 2 awk commands into one? How can I do it?

Thanks.
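
A minimal sketch of the loop described above, for illustration only. It assumes a POSIX-compatible awk (the -v option is missing from some very old awks) and uses shortened copies of the sample files; the scratch directory and the variable name result are made up for the example.

```shell
# Work in a scratch directory so nothing existing is overwritten.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Shortened copies of the two sample files from the question.
printf '%s\n' AA.12 AA.3 SP.1 > names
printf '%s\n' 'GRID_CHECK 102' 'AA.3 29999' 'AA.12 10177' \
    'SP.1 41' 'SP.11 5493' > records

# One awk call per name: pass the name in with -v and compare it with the
# first field as an exact string, so no pattern has to be typed by hand.
result=$(
    while IFS= read -r name; do
        awk -v n="$name" '$1 == n' records
    done < names
)
printf '%s\n' "$result"
```

The sketch compares with $1 == n because a regular expression such as /SP.1/ would also match the SP.11 line.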

-------------------------

Answers:

I suppose you want something like one of these...
# Just print the "corresponding number"...
awk 'NR==FNR {a[$1];next} $1 in a {print $2}' names records
# Print the whole line (as in your example)...
awk 'NR==FNR {a[$1];next} $1 in a' names records
Janis
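
For readers who want to try this, here is a minimal sketch of the two-file idiom above, run against a shortened copy of the sample data from the question. The scratch directory and the variable name result are only scaffolding for the illustration, and a POSIX-compatible awk (one that understands FNR and the "in" operator) is assumed.

```shell
# Work in a scratch directory so nothing existing is overwritten.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Shortened copies of the two sample files from the question.
printf '%s\n' AA.12 AA.15a SP.1 > names
printf '%s\n' 'GRID_CHECK 102' 'AA.3 29999' 'AA.12 10177' \
    'AA.15a 93' 'SP.1 41' 'SP.11 5493' > records

# While reading the first file (NR==FNR), remember each name as an array index.
# While reading the second file, print the number on every line whose first
# field was remembered.
result=$(awk 'NR==FNR {a[$1]; next} $1 in a {print $2}' names records)
printf '%s\n' "$result"
```

Each file is read once, so no shell loop is needed, and the lookup is an exact match on the first field (SP.11 is not picked up by the name SP.1).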

Oops:
awk 'NR==FNR{names[$0];next} $1 in names' names records

--

>Oops:
>awk 'NR==FNR{names[$0];next} $1 in names' names records

That's fine, but it is a little more coherent to write it as:
NR==FNR{names[$1];next}
$1 in names

This makes it clearer that you are linking on the first field in
each file.

--

> awk 'NR==FNR{names[$0];next} $1 in names' names records


The standard syntax is:
awk 'NR==FNR{names[$0];next}; $1 in names' names records

Or:
awk 'NR==FNR{names[$0];next}
$1 in names' names records

Stéphane

--

What standard? That just introduces a redundant semi-colon.
Ed

--

I was astonished by that statement as well. The relevant
syntax defined by POSIX says...

item_list        : newline_opt
                 | actionless_item_list item terminator
                 | item_list            item terminator
                 | item_list          action terminator
                 ;

where 'terminator' can be a sequence of semicolons/newlines.

And I share your opinion about the redundancy, especially
since the only case where a semicolon seems necessary to
disambiguate the syntax is the one where an action part is
missing but another condition/action part follows on the
same line. But even POSIX distinguishes this special case,
as can be seen from the actionless_item_list branch in the
above specification. So, why a terminator?

I compare it with a C-language construct like
if (a==1) { x=42; } else { y=43; }
if (a==1) x=42; else { y=43; }
if (a==1) ; else { y=43; }
where the semicolon is an empty statement in the last case
but is not required when a statement (awk: action) is present. In
awk as well, I have always considered the necessary ';' to be an
empty (i.e. default) action.

Janis
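
To make the terminator discussion concrete, here is a small sketch (the input lines a, b, c and the variable names are invented for the example). It shows the one case where the ';' matters, a pattern without an action followed by another rule on the same line, next to the equivalent spelling with an explicit action and no ';'. The second form is accepted by the awks commonly installed today, such as gawk and mawk; what a very old awk does with it is not something this sketch can show.

```shell
# Pattern without an action, followed by another rule on the same line:
# the ';' ends the first rule, so the default action (print) applies to it.
with_semicolon=$(printf '%s\n' a b c |
    awk 'NR==1; NR==3 {print "last:", $0}')

# Same program with an explicit action on the first rule: the closing
# brace already ends the rule, so no ';' is written between the rules.
without_semicolon=$(printf '%s\n' a b c |
    awk 'NR==1 {print} NR==3 {print "last:", $0}')

printf '%s\n' "$with_semicolon"
printf '%s\n' "$without_semicolon"
```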

-------------------------

My results from executing the commands directly in the shell:

# awk 'NR==FNR {names[$1];next} $1 in names {print $2}' names records
awk: syntax error near line 1
awk: bailing out near line 1
# awk 'NR==FNR {names[$1];next} $1 in names' names records
awk: syntax error near line 1
awk: bailing out near line 1
# awk 'NR==FNR{names[$0];next} $1 in names' names records
awk: syntax error near line 1
awk: bailing out near line 1
# awk 'NR==FNR{names[$0];next}; $1 in names' names records
awk: syntax error near line 1
awk: bailing out near line 1
# awk 'NR==FNR{names[$0];next}
Unmatched '.
# $1 in names' names records
Unmatched '.
#

----

My results from running the script join_awk, which contains only one
command (awk) after the first line #!/usr/bin/tcsh:

--

#!/usr/bin/tcsh
awk 'NR==FNR {names[$1];next} $1 in names {print $2}' names records

# join_awk
awk: syntax error near line 1
awk: bailing out near line 1

--

#!/usr/bin/tcsh
awk 'NR==FNR {names[$1];next} $1 in names' names records

# join_awk
awk: syntax error near line 1
awk: bailing out near line 1

--

#!/usr/bin/tcsh
awk 'NR==FNR{names[$0];next} $1 in names' names records

# join_awk
awk: syntax error near line 1
awk: bailing out near line 1

--

#!/usr/bin/tcsh
awk 'NR==FNR{names[$0];next}; $1 in names' names records

# join_awk
awk: syntax error near line 1
awk: bailing out near line 1

--

#!/usr/bin/tcsh
awk 'NR==FNR{names[$0];next}
$1 in names' names records

# join_awk
Unmatched '.

--------

What's wrong with my command or script? What's the correct way to
do it?

Thanks.
  #2  
Old 08-11-2008, 09:30 AM
Re: Re-post: How to join 2 awk commands into one, using the output of the first as the matching pattern of the second?

On 8/11/2008 8:25 AM, Kuhl wrote:
> Hi, all:
>
> I posted this question several days ago. Thank you all who have
> answered the question. But it's still not working in my environment
> after trying the answers.


In what way is it "not working"? Post what command you ran along with the input
files you ran it on and the output you got and explain why that's not what you
expected.

Ed.

  #3  
Old 08-11-2008, 09:30 AM
Re: Re-post: How to join 2 awk commands into one, using the output of the first as the matching pattern of the second?

Kuhl wrote:
> Hi, all:
>
> I posted this question several days ago. Thank you all who have
> answered the question. But it's still not working in my environment
> after trying the answers. So I re-post it now, in case the original
> topic might be too de-prioritized to be seen on the page.

[snip most of repost]
>
> As shown above, for each line in names, I need two awk commands to get
> the result. The matching pattern of the second command is the result
> of the first command. So I can only do it manually one by one. If the
> 2 awk commands can be joined into one command without knowing the
> output of the first command, then I can write it easily as a loop in a
> shell program.
>
> Is it possible to join the 2 awk commands into one? How to do it?
>
> Thanks.
>
> -------------------------
>
> Answers:

[snip huge list of answers]

> What's the matter in my command or script? What's the correct way to
> do it?
>
> Thanks.


Please tell us what problems you have with the suggested solutions.

Janis
  #4  
Old 08-12-2008, 09:18 AM
Re: Re-post: How to join 2 awk commands into one, using the output of the first as the matching pattern of the second?

On Aug 11, 3:25 pm, Kuhl wrote:

> I posted this question several days ago. Thank you all who
> have answered the question. But it's still not working in my
> environment after trying the answers. So I re-post it now, in
> case the original topic might be too de-prioritized to be seen
> on the page.


> I have 2 files now: names records


> Format of names looks like:


> AA.12
> AA.15a
> AA.16a
> AA.17
> AA.19
> AA.1l_AA.2l
> AA.3
> AA.5
> SP.1
> SP.16l
> SP.17l


> Format of records looks like following, but might have more lines than
> file names:


> GRID_CHECK 102
> AA.1l_AA.2l 63
> AA.3 29999
> AA.5 116
> AA.12 10177
> AA.15a 93
> AA.16a 1
> AA.17 868
> AA.19 2
> PWH.2 7
> PWH.3 100
> RLHVP.1 409
> ONODMY.11 112032
> GT.5l 8258
> PLL.9 931
> PLL.12 5779
> SP.1 41
> SP.2 2
> SP.9 1246
> SP.11 5493
> SP.16l 4920
> SP.17l 20244
> SAB.1 27
> SAB.3 8


> I need to find the corresponding number of each name in the file
> records. So I did the following:


> # awk 'NR==1' names
> AA.12
> # awk '/AA.12/' records
> AA.12 10177
> # awk 'NR==2' names
> AA.15a
> # awk '/AA.15a/' records
> AA.15a 93
> # awk 'NR==3' names
> AA.16a
> # awk '/AA.16a/' records
> AA.16a 1
> ... ...
> until the last line of file names.


> As shown above, for each line in names, I need two awk
> commands to get the result. The matching pattern of the second
> command is the result of the first command. So I can only do
> it manually one by one. If the 2 awk commands can be joined
> into one command without knowing the output of the first
> command, then I can write it easily as a loop in a shell
> program.


> -------------------------


> Answers:

[...]
> -------------------------
>
> My result by executing the command directly in the shell:
>
> # awk 'NR==FNR {names[$1];next} $1 in names {print $2}' names records
> awk: syntax error near line 1
> awk: bailing out near line 1


What version of AWK are you using? This conforms to POSIX,
but some systems aren't POSIX compatible by default; Solaris,
for example, gives you a very outdated AWK by default (one
which doesn't understand FNR, for example). (If you're under
Solaris, or for that matter, in general, you should use "getconf
PATH" to set your path. If you do this, you'll get the good
AWK.)
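
A minimal sketch of that advice, assuming a system where getconf itself is reachable through the current PATH. The probe program and the variable name result are only for illustration.

```shell
# Put the directories that the system reports for the standard utilities
# first, then keep the caller's original PATH for everything else.
PATH=$(getconf PATH):$PATH
export PATH

# FNR is one of the features the outdated awk does not understand,
# so a program using it is a quick check that a usable awk was found.
result=$(printf '%s\n' x y | awk 'FNR == 2 {print FNR ":" $0}')
printf '%s\n' "$result"
```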

If you really want to have it work with every AWK, then use
FILENAME:

awk 'FILENAME = "names" { names[ $1 ]; next }
$1 in names { print $2 }' names records

(I'd also insert the ';' between the items, just in case. Or
just write the command on two lines, as above.)

--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
  #5  
Old 08-12-2008, 09:41 AM
Re: Re-post: How to join 2 awk commands into one, using the output of the first as the matching pattern of the second?

On 8/12/2008 8:18 AM, James Kanze wrote:

> If you really want to have it work with all AWK, then use
> FILENAME:
>
> awk 'FILENAME = "names" { names[ $1 ]; next }


ITYM:

awk 'FILENAME == "names" ...

i.e. double equals. Also, to avoid hard-coding the file name, you should use:

awk 'FILENAME == ARGV[1] ...

I've no idea if that'll work in old, broken awk, but it'll work in any modern awk.
No-one should use old, broken awk, for many reasons, so if it doesn't work in
the awk you're using, get a modern awk rather than modifying your code.

It's not a bad idea to do the "test for first file" comparison as above instead of

awk 'NR == FNR ...

as the latter won't work as expected if the first file is empty (which is very
rarely the case, which is why the latter form is the most commonly used).
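
A small sketch of the empty-first-file problem, under assumed conditions: the names file is empty, and the program is a variation that prints the records whose name is NOT listed (with the original program both spellings print nothing in this situation, so the difference would not be visible). The scratch directory and variable names are invented for the example.

```shell
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

: > names                                   # empty first file
printf '%s\n' 'AA.3 29999' 'SP.1 41' > records

# Goal: print the records whose name is not in names. With an empty
# names file, that should be every record.

# NR==FNR is also true for every line of records here, because NR never
# advanced while reading the empty first file, so nothing is printed.
by_nr=$(awk 'NR==FNR {names[$1]; next} !($1 in names)' names records)

# Comparing FILENAME with ARGV[1] identifies the first file by name instead.
by_filename=$(awk 'FILENAME==ARGV[1] {names[$1]; next} !($1 in names)' names records)

printf 'NR==FNR: [%s]\n' "$by_nr"
printf 'FILENAME==ARGV[1]:\n%s\n' "$by_filename"
```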

> $1 in names { print $2 }' names records
>
> (I'd also insert the ';' between the items, just in case.


The command is fine as-is without spurious semi-colons being added.

Ed.

> Or just write the command on two lines, as above.)


  #6  
Old 08-13-2008, 07:26 AM
Re: Re-post: How to join 2 awk commands into one, using the output of the first as the matching pattern of the second?

On Aug 12, 3:41 pm, Ed Morton wrote:
> On 8/12/2008 8:18 AM, James Kanze wrote:
>


> > If you really want to have it work with all AWK, then use
> > FILENAME:


> > awk 'FILENAME = "names" { names[ $1 ]; next }


> ITYM:


> awk 'FILENAME == "names" ...


> i.e. double equals.


Obviously:-(.

> Also, to avoid hard-coding the file name, you should use:


> awk 'FILENAME == ARGV[1] ...


That's a good idea.

> I've no idea if that'll work in old, broken awk but it'll work
> in any modern awk and no-one should use old, broken awk for
> many reasons so if it doesn't work in the awk you're using,
> get a modern awk rather than modifying your code.


I agree, but it's not always that easy. If you're writing an
executable awk program, for example, what do you put on the
first line, other than:
#! /usr/bin/awk
? And that gets you the old, broken AWK under Solaris. Or if
you want to invoke awk in a shell script, and you don't know
what path the user will have set, or what machine he's running
on. If I don't really need the new features (simple jobs in a
shell script, for example), I'll generally try to be compatible
with the older AWKs. Otherwise... I've got a small shell script
which tries out a sequence of likely candidates, and uses the
first one which seems to work. (It's designed so that you can
install a link to it with the actual name; it basically invokes
"$AWK -f $0.awk $*".)

> It's not a bad idea to do the "test for first file" comparison
> as above instead of


> awk 'NR == FNR ...


> as the latter won't work as expected if the first file is
> empty (which is very rarely the case which is why the latter
> form is the most commonly used).


> > $1 in names { print $2 }' names records


> > (I'd also insert the ';' between the items, just in case.


> The command is fine as-is without spurious semi-colons being
> added.


With any modern AWK. I'm less sure about older AWKs. And at
any rate, the ';' makes the code more readable for human
readers.

--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
  #7  
Old 08-16-2008, 08:19 AM
Re: Re-post: How to join 2 awk commands into one, using the output of the first as the matching pattern of the second?

On 8/13/2008 6:26 AM, James Kanze wrote:

>>I've no idea if that'll work in old, broken awk but it'll work
>>in any modern awk and no-one should use old, broken awk for
>>many reasons so if it doesn't work in the awk you're using,
>>get a modern awk rather than modifying your code.

>
>
> I agree, but it's not always that easy. If you're writing an
> executable awk program, for example, what do you put on the
> first line, other than:
> #! /usr/bin/awk


I never write my scripts that way so I don't know exactly what to suggest. Maybe
#!awk? I always just invoke awk from the shell script, so I could do whatever I
like up front to find a modern awk, but in reality I just modify the PATH to
prepend the directories that typically contain modern awks, e.g.:

PATH=/usr/xpg4/bin:$PATH
awk '...'

> ? And that gets you the old, broken AWK under Solaris. Or if
> you want to invoke awk in a shell script, and you don't know
> what path the user will have set, or what machine he's running
> on.


If you're worried about that, have your awk test for being old, broken awk and,
if it is, print an error message and exit.

> If I don't really need the new features (simple jobs in a
> shell script, for example), I'll generally try to be compatible
> with the older AWKs.


It's not the lack of new features that's the only problem, it's also the
breakage in old features. For example, the "sprintf()" function in old, broken
awk thinks the ", 3" below are its arguments rather than arguments for "print":

$ awk 'BEGIN{print sprintf("The magic number is <%d>"), 3; exit}'
The magic number is <3>
$ gawk 'BEGIN{print sprintf("The magic number is <%d>"), 3; exit}'
gawk: fatal: not enough arguments to satisfy format string
`The magic number is <%d>'
^ ran out for this one

$ awk 'BEGIN{print sprintf("The magic number is <%d>", 4), 3; exit}'
The magic number is <4>
$ gawk 'BEGIN{print sprintf("The magic number is <%d>", 4), 3; exit}'
The magic number is <4> 3

> Otherwise... I've got a small shell script
> which tries out a sequence of likely candidates, and uses the
> first one which seems to work. (It's designed so that you can
> install a link to it with the actual name; it basically invokes
> "$AWK -f $0.awk $*".)


OK, but I think all you really need to do is stick /usr/xpg4/bin at the front of
the PATH as, as far as I can tell, only Solaris has old, broken awk in /usr/bin.
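
For reference, the kind of candidate-probing wrapper described in the quoted text might be sketched roughly as follows. This is not the actual script from the thread: the candidate list, the probe program and the variable names are all assumptions made for the illustration.

```shell
# Candidate interpreters, most preferred first (an assumed list).
AWK=
for candidate in /usr/xpg4/bin/awk gawk nawk mawk awk; do
    # Probe: a POSIX awk understands user-defined functions, which the
    # old awk rejects with a syntax error. Missing candidates simply fail.
    if out=$("$candidate" 'function f(x) { return x + 1 } BEGIN { print f(41) }' 2>/dev/null) &&
       [ "$out" = "42" ]; then
        AWK=$candidate
        break
    fi
done

if [ -z "$AWK" ]; then
    echo "no usable awk found" >&2
    exit 1
fi
echo "using $AWK"

# A real wrapper would now hand over to the chosen interpreter, along the
# lines of the invocation quoted above, e.g.:  exec "$AWK" -f "$0.awk" "$@"
```

Whether this is worth the trouble compared with simply prepending a directory to PATH is exactly the point under discussion here.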

>
>>It's not a bad idea to do the "test for first file" comparison
>>as above instead of

>
>
>> awk 'NR == FNR ...

>
>
>>as the latter won't work as expected if the first file is
>>empty (which is very rarely the case which is why the latter
>>form is the most commonly used).

>
>
>>> $1 in names { print $2 }' names records

>>

>
>>>(I'd also insert the ';' between the items, just in case.

>>

>
>>The command is fine as-is without spurious semi-colons being
>>added.

>
>
> With any modern AWK. I'm less sure about older AWK's. And at
> any rate, the ';' makes the code more readable for human
> readers.


I guess we'll have to agree to disagree on that one. It just makes me take a
second pass of the script to try to figure out what the semi-colon's doing there.

Ed.




  #8  
Old 08-17-2008, 03:28 AM
Re: Re-post: How to join 2 awk commands into one, using the output of the first as the matching pattern of the second?

On Aug 16, 2:19 pm, Ed Morton wrote:
> On 8/13/2008 6:26 AM, James Kanze wrote:
>


> >>I've no idea if that'll work in old, broken awk but it'll work
> >>in any modern awk and no-one should use old, broken awk for
> >>many reasons so if it doesn't work in the awk you're using,
> >>get a modern awk rather than modifying your code.


> > I agree, but it's not always that easy. If you're writing an
> > executable awk program, for example, what do you put on the
> > first line, other than:
> > #! /usr/bin/awk


> I never write my scripts that way so I don't know exactly what
> to suggest. Maybe #!awk?


You need an absolute path after the #!. Or at least I think
that's needed; it was certainly the case in the past, and I haven't
tried anything else lately. There's also a trick with env, but
it still picks up whatever AWK the user has first in his path.

> I always just invoke awk from the shell script, so I could do
> whatever I like up front to find a modern awk,


That's fine if you're the person invoking it. Some of my
scripts are meant for use by others, and they don't want to know
whether it is AWK or something else.

> but in reality I just modify the PATH to prepend the
> directories that typically contain modern awks, e.g.:


> PATH=/usr/xpg4/bin:$PATH
> awk '...'


Which is fine for Solaris, but doesn't help under Linux.
According to POSIX, you should always start by setting the path
to:
PATH=` getconf PATH `
and add to that. That guarantees a POSIX-compatible AWK (and
seems to be pretty portable); of course, it supposes that the
original path will reach getconf.

> > ? And that gets you the old, broken AWK under Solaris. Or if
> > you want to invoke awk in a shell script, and you don't know
> > what path the user will have set, or what machine he's running
> > on.


> If you're worried about that, have your awk test for being
> old, broken awk and if it is print an error message and exit.


Oh, it will print an error message, all right. About a syntax
error, because it doesn't understand functions.

> > If I don't really need the new features (simple jobs in a
> > shell script, for example), I'll generally try to be compatible
> > with the older AWKs.


> It's not the lack of new features that's the only problem,
> it's also the breakage in old features. For example, the
> "sprintf()" function in old, broken awk thinks the ", 3" below
> are its arguments rather than arguments for "print":
>
> $ awk 'BEGIN{print sprintf("The magic number is <%d>"), 3; exit}'
> The magic number is <3>
> $ gawk 'BEGIN{print sprintf("The magic number is <%d>"), 3; exit}'
> gawk: fatal: not enough arguments to satisfy format string
> `The magic number is <%d>'
> ^ ran out for this one


> $ awk 'BEGIN{print sprintf("The magic number is <%d>", 4), 3; exit}'
> The magic number is <4>
> $ gawk 'BEGIN{print sprintf("The magic number is <%d>", 4), 3; exit}'
> The magic number is <4> 3


I know, there are differences in syntax. But it's fairly easy
to write to a least common denominator if you don't need the new
features. (The only problem is remembering to test it. I've
got /usr/xpg4/bin in my path, as well, under Solaris, and of
course the only AWK on Linux is gawk. So by default, I get a
new AWK.)

> > Otherwise... I've got a small shell script
> > which tries out a sequence of likely candidates, and uses the
> > first one which seems to work. (It's designed so that you can
> > install a link to it with the actual name; it basically invokes
> > "$AWK -f $0.awk $*".)


> OK, but I think all you really need to do is stick
> /usr/xpg4/bin at the front of the PATH as, as far as I can
> tell, only Solaris has old, broken awk in /usr/bin.


Except the /usr/xpg4/bin won't work under Linux (and probably
some other OS's as well).

--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
  #9  
Old 08-17-2008, 04:20 AM
Re: Re-post: How to join 2 awk commands into one, using the output of the first as the matching pattern of the second?

2008-08-17, 00:28(-07), James Kanze:
[...]
>> I never write my scripts that way so I don't know exactly what
>> to suggest. Maybe #!awk?

>
> You need an absolute path after the #!.

[...]

You need a path.

It may be relative, but then that doesn't make a lot of sense.

If you put:

#! awk -f

The script will only work when the current working directory
contains an executable called "awk".

So:

cd /usr/bin; my_script

will work if there's a /usr/bin/awk.

On most systems, you can't do:

#! /usr/bin/env awk -f

because in most #! implementations, only one argument to the
interpreter is allowed (see
http://www.in-ulm.de/~mascheck/various/shebang/ for more info).

You can replace the shebang line with:

"exec" "awk" "-f" "$0" "$@" && 0

Without a shebang (shebangs are not POSIX anyway), a shell
(sh) should be called to interpret the script. For a shell, that
first line tells sh to execute awk with the same file. For awk,
that line is a pattern that is always false so would be ignored.

There is a problem though if your script consists of only one
BEGIN statement. In that case, you'd need to add an "exit" to
the end of the BEGIN action.
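
A minimal sketch of the trick above, so it can be tried without touching any real script. The file names count_lines and input and the little line-counting program are invented for the example; the first line of the script is the one given above.

```shell
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# A script with no shebang line. sh sees an exec command on line 1;
# awk sees a pattern that is always false, followed by the real program.
cat > count_lines <<'EOF'
"exec" "awk" "-f" "$0" "$@" && 0
{ n++ }
END { print n + 0, "lines" }
EOF

printf '%s\n' one two three > input

# Run it the way a shell would interpret a shebang-less script.
result=$(sh "$tmpdir/count_lines" input)
printf '%s\n' "$result"
```

Because sh replaces itself with awk on the first line, it never reads the awk rules that follow.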

--
Stéphane
  #10  
Old 08-17-2008, 08:56 AM
Re: Re-post: How to join 2 awk commands into one, using the output of the first as the matching pattern of the second?

On 8/17/2008 2:28 AM, James Kanze wrote:
> On Aug 16, 2:19 pm, Ed Morton wrote:
>
>>On 8/13/2008 6:26 AM, James Kanze wrote:
>>

>
>
>>>>I've no idea if that'll work in old, broken awk but it'll work
>>>>in any modern awk and no-one should use old, broken awk for
>>>>many reasons so if it doesn't work in the awk you're using,
>>>>get a modern awk rather than modifying your code.
>>>

>
>>>I agree, but it's not always that easy. If you're writing an
>>>executable awk program, for example, what do you put on the
>>>first line, other than:
>>> #! /usr/bin/awk

>>

>
>>I never write my scripts that way so I don't know exactly what
>>to suggest. Maybe #!awk?

>
>
> You need an absolute path after the #!. Or maybe that's
> needed; it was certainly the case in the past, and I haven't
> tried anything else lately. There's also a trick with env, but
> it still picks up whatever AWK the user has first in his path.
>
>
>>I always just invoke awk from the shell script, so I could do
>>whatever I like up front to find a modern awk,

>
>
> That's fine if you're the person invoking it. Some of my
> scripts are meant for use by others, and they don't want to know
> whether it is AWK or something else.


I didn't say I invoke awk from the interactive shell, I said I invoke it from
the shell script, i.e. like this:

#!/bin/sh or whatever
awk 'stuff' "$@"

instead of:

#!/usr/bin/awk
stuff

>
>>but in reality I just modify the PATH to prepend the
>>directories that typically contain modern awks, e.g.:

>
>
>>PATH=/usr/xpg4/bin:$PATH
>>awk '...'

>
>
> Which is fine for Solaris, but doesn't help under Linux.


Right, but Linux doesn't come with old, broken awk so you don't have to worry
about it.

> According to POSIX, you should always start by setting the path
> to:
> PATH=` getconf PATH `
> and add from that. That guarantees a Posix compatible AWK (and
> seems to be pretty portable); of course, it supposes that the
> original path will reach getconf.


I don't think POSIX precludes you from adding specific directories to either
side of your PATH after running getconf.

>
>
>>>? And that gets you the old, broken AWK under Solaris. Or if
>>>you want to invoke awk in a shell script, and you don't know
>>>what path the user will have set, or what machine he's running
>>>on.

>>

>
>>If you're worried about that, have your awk test for being
>>old, broken awk and if it is print an error message and exit.

>
>
> Oh, it will print an error message, all right. About a syntax
> error, because it doesn't understand functions.


If you're catering to old, broken awk then you aren't using functions now
anyway, but if you really want to you could write your shell script as:

awk 'BEGIN{
    if (ARGC == 0) {
        print "old, broken awk found: fix PATH to...." | "cat>&2"
        exit 1
    } else {
        exit 0
    }
}' &&
awk 'real script' "$@"

I contend that just prepending /usr/xpg4/bin to your PATH means you'll
never hit old, broken awk so you'll never need to do that.

>
>>>If I don't really need the new features (simple jobs in a
>>>shell script, for example), I'll generally try to be compatible
>>>with the older AWKs.

>>

>
>>It's not the lack of new features that's the only problem,
>>it's also the breakage in old features. For example, the
>>"sprintf()" function in old, broken awk thinks the ", 3" below
>>are its arguments rather than arguments for "print":
>>
>>$ awk 'BEGIN{print sprintf("The magic number is <%d>"), 3; exit}'
>>The magic number is <3>
>>$ gawk 'BEGIN{print sprintf("The magic number is <%d>"), 3; exit}'
>>gawk: fatal: not enough arguments to satisfy format string
>> `The magic number is <%d>'
>> ^ ran out for this one

>
>
>>$ awk 'BEGIN{print sprintf("The magic number is <%d>", 4), 3; exit}'
>>The magic number is <4>
>>$ gawk 'BEGIN{print sprintf("The magic number is <%d>", 4), 3; exit}'
>>The magic number is <4> 3

>
>
> I know, there are differences in syntax. But it's fairly easy
> to write to a least common denominator if you don't need the new
> features. (The only problem is remembering to test it. I've
> got /usr/xpg4/bin in my path, as well, under Solaris, and of
> course the only AWK on Linux is gawk. So by default, I get a
> new AWK.)
>
>
>>>Otherwise... I've got a small shell script
>>>which tries out a sequence of likely candidates, and uses the
>>>first one which seems to work. (It's designed so that you can
>>>install a link to it with the actual name; it basically invokes
>>>"$AWK -f $0.awk $*".)

>>

>
>>OK, but I think all you really need to do is stick
>>/usr/xpg4/bin at the front of the PATH as, as far as I can
>>tell, only Solaris has old, broken awk in /usr/bin.

>
>
> Except the /usr/xpg4/bin won't work under Linux (and probably
> some other OS's as well).


Prepending /usr/xpg4/bin to your PATH will work just fine on Linux or anywhere
else - it'll just be a directory that doesn't exist, which your script won't
care about.
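
A tiny sketch of that point, for illustration (the message text and the variable name result are made up): on a system without /usr/xpg4/bin the lookup just moves on to the next PATH entry.

```shell
# /usr/xpg4/bin exists on Solaris; elsewhere the shell skips the missing
# directory and finds awk further along the PATH.
PATH=/usr/xpg4/bin:$PATH
export PATH

result=$(awk 'BEGIN { print "awk found" }')
printf '%s\n' "$result"
```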

Ed.


