Parsing basename or dirname with shell parameter expansion or sed

A review of the basics – this demonstrates several ways to parse path and filename strings. Keep in mind that regular expressions are greedy by default.
Given that MYVAR=/this/and/that.ext
Here is an example:
echo this/is/a/test | sed -e 's/.*\///'     # greedy
result: test
vs
echo this/is/a/test | sed -e 's/[^\/]*\///' # non-greedy
result: is/a/test

To extract the basename:
Using shell parameter expansion, i.e.: ${MYVAR##*/}
result: that.ext
Note: see man bash and search for "##" or "Parameter Expansion"
or using sed: echo $MYVAR | sed -e 's/.*\///'
result: that.ext
or using sed with a slightly easier-to-read delimiter: echo $MYVAR | sed -e 's|.*/||'
result: that.ext
or using basename: basename $MYVAR
result: that.ext

To extract the dirname:
Using shell parameter expansion, i.e.: ${MYVAR%/*}
result: /this/and
or using sed: echo $MYVAR | sed -e 's/[^/]*$//'
result: /this/and/
NOTE: the dollar-sign anchor is needed.
or using sed with a slightly easier-to-read delimiter: echo $MYVAR | sed -e 's|[^/]*$||'
result: /this/and/
NOTE: the dollar-sign anchor is needed, and note the trailing slash that the sed versions leave behind.
or using dirname: dirname /this/and/that.ext
result: /this/and
=======
To strip off the extension
Using shell parameter expansion, i.e.: ${MYVAR%.*}
result: /this/and/that
or using sed: echo $MYVAR | sed -e 's/\.[^.]*$//'
result: /this/and/that
NOTE: the dollar-sign anchor is needed.
or using sed with a slightly easier-to-read delimiter: echo $MYVAR | sed -e 's|\.[^.]*$||'
result: /this/and/that
NOTE: the dollar-sign anchor is needed.
=======
To show just the filename without the extension
using basename with a suffix argument: basename /this/and/that.ext .ext
result: that
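
The two parameter expansions can also be combined to get this with no external command at all – a small sketch using a temporary variable:

FILE=${MYVAR##*/}   # strip the directory part: that.ext
echo ${FILE%.*}     # then strip the extension
result: that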

How to see lines that exceed the vi buffer

I ran into this problem again today. While analyzing a huge file I needed to see the contents of lines that were well past the vi buffer. Yes, I could have tweaked the vi settings and gained a larger buffer, but I was in a hurry.
I had done some analysis using egrep/sort/uniq (see the details below) which showed that the line I needed to see was 2489847, but the vi buffer was not getting past 100000.
I fell back on an old trick to see the desired line (2489847) and the lines around it.

 head  -2489850 default_export.xml |tail -10

This listed lines 2489841 through 2489850 and allowed me to see what I needed.
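
For what it's worth, sed can do the same thing in one command – the trailing q keeps it from reading the rest of the huge file:

 sed -n '2489841,2489850p;2489850q' default_export.xml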

Just for those who might be interested: I needed to find out how many times a parent id was called and then find out what name the parent id belonged to, in a huge XML dump where you might see


<id>12345</id>
<parent>23456</parent>
....

Here is the first step – get a count of each parent id reference:

egrep -n "<parent>[0-9]*</parent>" default_export.xml \
|sort -k2,2|uniq -f 1  -c|sort -n |tail -5

This gave me this output (column 1 = count, column 2 = one of the line numbers where the parent id occurred, column 3 = parent id):

 684 488774:    <parent>21426</parent>
 848 387747:    <parent>15243</parent>
 935 108754:    <parent>1</parent>
1874 2503542:        <parent>1</parent>
3223 2489855:    <parent>25895</parent>
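
For what it's worth, a single awk pass can produce the same counts – a sketch, assuming each <parent> tag sits on its own line as above (you lose the line-number column this way, but it avoids the double sort):

awk -F'[<>]' '/<parent>/ {count[$3]++} END {for (id in count) print count[id], id}' default_export.xml \
|sort -n |tail -5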

So now I know that parent id 25895 occurred the greatest number of times – but I needed to see who the id belongs to. I needed to find where the unique id 25895 is defined – so:

 egrep -n "<id>25895</id>" default_export.xml

Which gives the line number:
2489847

That is why I needed to see the lines around 2489847.

Java VM thread dumps

I have a love-hate relationship with Java. It is important to understand that my inclinations are focused on system admin engineering and not on development. It seems everywhere I work Java plays a major role. There are lots of talented developers who write great Java code, but Java is not the answer to all needs. On one occasion a developer had written a Java tool to build a report. It took hours to run and sometimes would not finish in time to be useful. So I re-wrote it in awk, sed, and grep – it ran in half an hour.
Java runs in a VM – which has limited visibility, especially for system calls and the like. There are tools that can help – jvisualvm, jmap, jstat, JTop, etc. – and of course the application logs. Typically the logs are semi-controllable through property files (most developers use log4j), but the bottom line is that the coders are ultimately in control of what gets written into the logs. Tools like truss and strace cannot see inside the VM, so you must rely on the JDK tools above to troubleshoot it.
In that line of thinking, I use a script to dump threads and to get jmap histos and prstat output if available.
The jmap histos can help the developers gain some insight into memory usage and possibly identify memory leaks.
The thread dumps are useful for seeing locked threads which may be holding up other threads.
Sometimes performance issues (on Solaris) can be analyzed by taking the busiest thread from prstat and comparing it with the threads identified in the thread dumps. One catch is that prstat outputs the thread IDs as decimals while thread dumps output them as hex, so you have to convert one or the other to find its equivalent.
This script adds a column to the prstat output that converts the decimal ID into hex to facilitate the analysis.
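If you only need a one-off conversion rather than an extra column, the shell's printf can do it (the 12345 here is just a stand-in for an ID taken from prstat):

printf '%x\n' 12345   # prints 3039 - compare against the nid=0x... values in the thread dump
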
Here is the script – modify as you need – the string to search for in the ps listing needs to be changed to uniquely identify your Java VM.

#!/bin/bash
# script: tdumps.sh
# quick-n-dirty to dump threads and histos
# 20110131 added prstat out to an prstat.out file with time stamps
# redirect output to a file if desired
# modify for your use.
# 20110413 made this a bit more generic and added additional comments
# also added the HEX column in the prstat output

TDATE=`date +%Y%m%d-%H%M`
AWK_CMD=`type -p nawk`

TDUMP_CMD=" kill -3 "
#  and or
JHISTO_CMD=" jmap -histo "

VM_STRING="UNIQUE_STRING_HERE" #string to search for - change this as needed
#echo DEBUG: VM_STRING=$VM_STRING

VM_PID=`ps -ef | grep $VM_STRING | grep -v grep | awk '{print $2}'`
#echo DEBUG: VM_PID=$VM_PID

ITERATIONS=6
#echo DEBUG: ITERATIONS=$ITERATIONS
SLEEP=10

i=0
while [ $i -lt $ITERATIONS ]; do
  echo ==== count=$i `date +%Y%m%d-%H%M` ====
  #echo DEBUG: Now running: $TDUMP_CMD $VM_PID
  # be SURE the next line is going to do what you want before running this for real
  # this command assumes the thread dump output will go into a log
  # you may have to redirect this if the output goes to the console 
  $TDUMP_CMD $VM_PID
  #echo DEBUG: Now running: $JHISTO_CMD $VM_PID
  # again, be SURE the next lines are going to do what you want
  echo `date +%Y%m%d-%H%M`>>histos-$TDATE.out
  # this will only grab the top 65 histos and that is probably what you want
  $JHISTO_CMD $VM_PID | head -65  >>histos-$TDATE.out
  echo "We are only capturing the top 65 rows of the histo output on each run">>histos-$TDATE.out
  # or you can run the full histo dump - but this is a lot of output
  #$JHISTO_CMD $VM_PID  >>histos-$TDATE.out

  # this only runs if we are on a SunOS platform - but it can be very helpful in analysis
  # as you can get the busiest thread from prstat:
  # convert the thread number from decimal to hex and find it in the 
  # thread dumps - that can tell you what VM thread may be a performance hog
  if uname -s | grep -q Sun ; then
     # only works in Solaris
     # this next line is needed to throw the timestamp into the output file
     echo `date +%Y%m%d-%H%M`>>prstat-$TDATE.out
     # this grabs just the top 20 and adds a column that translates the thread PID to HEX
     echo "We are only capturing the top 20 rows of the prstat output">>prstat-$TDATE.out
     prstat -Lm 1 1 | head -20|$AWK_CMD '{if($1~/PID/){printf "HEX ";}else{printf "%X ",$1}print $0}'>>prstat-$TDATE.out
     # or you can use this next one which does not add the HEX column and gets all the threads
     #prstat -Lm 1 1  >> prstat-$TDATE.out
  fi
  echo Sleeping for $SLEEP seconds...
  sleep $SLEEP
  ((i++)) # or i=$((i + 1))
done
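
To capture a run, I usually tee the console output to a file as well (the filename here is just an example):

./tdumps.sh 2>&1 | tee tdumps-`date +%Y%m%d-%H%M`.log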

*nix shell profiles – order is important!

Often I hear the question "How should I update my profile?" and that usually leads to "Which file should I use?"
It is important to understand the file loading order and why this adds flexibility.
My comments here will focus on bash, but most shells have similar loading behavior.
When you're bored one rainy day, do a "man bash". Be forewarned: this will bring up one of the largest man pages on the system. To find information on how bash loads its files, search for the section heading called INVOCATION.
The order here is important.
On login, bash first reads and executes commands from
/etc/profile, if that file exists. <- this affects *all* users on the system
After reading that file, it looks for (~ = your home dir)
~/.bash_profile, <- affects only this user
~/.bash_login, and <- affects only this user
~/.profile <- honored by many other shells, a kind of universal source file
and reads and executes commands from the first one of those three that exists and is readable.
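
A quick way to see this order for yourself is to drop a marker near the top of each file and open a new login shell (these echo lines are just for the demonstration – remove them afterward):

echo "reading /etc/profile"     # add to /etc/profile (requires root)
echo "reading ~/.bash_profile"  # add to ~/.bash_profile
echo "reading ~/.bashrc"        # add to ~/.bashrc

The order in which the markers print is the load order on that system.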

Many distros will include a source file request within the ~/.bash_profile similar to this:

if [ -f ~/.bashrc ]; then
  . ~/.bashrc # "." is the portable POSIX way to source a file;
              # bash also accepts the synonym "source"
fi

I even include another one at the end of my .bashrc to look for an include file called ~/.bash_local – for commands local to that server. For example:

# --- add this at the end of ~/.bashrc ---#
if [ -f $HOME/.bash_local ]; then
  source $HOME/.bash_local
else
  echo 'PATH=$PATH:$HOME/dev/utils'>>$HOME/.bash_local
  source $HOME/.bash_local
fi
#---- ~/.bash_local contents ----#
PATH=$PATH:$HOME/dev/utils
alias fin="cd ~/dev/fin"
alias www="cd /var/www/"

Enjoy,
Geoff

Use awk to grab the remaining X fields.

I needed to grab all the fields (command-line args) from the "history" command output. I started to do what I have always done in the past (assuming you need fields 4 through the end):

history | awk -v col=4 '{ for (x=col; x<=NF; x++) {
printf "%s ", $x; }; print "" }'

But I found an easier way:

history | awk '{$1=$2=$3="";print $0}'

or

history | awk '{$1=$2=$3=""}1'
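
One caveat: assigning to the fields rebuilds $0 with the output field separator, so the result keeps three leading blanks. If that matters, strip them (a minimal sketch):

history | awk '{$1=$2=$3=""; sub(/^ +/,""); print}'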

Here is a practical use of this:

history | awk '{$1=$2=$3="";print $0}' | sort | uniq -c | sort -n # frequency of command use

I constantly use awk, grep, and sed to work out issues. My philosophy in using awk, grep, and sed vs Perl is twofold. First, these fundamental tools are universally found on *nix servers – even ones that have severe security requirements (think STIG here). The second reason is grounds for a flame war: I believe scripts using these tools are easier for a team to maintain than Perl. Perl has so many ways of accomplishing a task that the code becomes very personalized and styled to the coder's preferences. It becomes a "child" that only the father can take care of. But I digress…
Enjoy!

A simple shell calculator

As a System Engineer, the command line is where I live most of the time, so having a quick means of calculating is needed.

The tool "bc" (an arbitrary-precision calculator) is your friend here, but you must be aware of the -l option. I often use one of two methods. For a series of calculations I will use "bc -l". The -l option includes the standard math library (my reasoning here is an oversimplification: it allows floating-point results).

I have an alias in my (extensive) .bashrc script that looks like this:

alias bcl="bc -l"

This allows me to quickly establish a live calculator by typing "bcl" and, when I am finished, hitting Ctrl-D.
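
For example (with -l, the default scale is 20 decimal places):

bcl
22/7
3.14285714285714285714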

The second method still uses "bc" but is a function in my .bashrc script called calcit. Note that within the comments I provide a way to do this in perl as well.

########
calcit() # command line calculator
########
{
# if no argument provided (we expect at least one)
if [ "x$1" = "x" ] ; then
  echo Examples: calcit -s 2 \"13000 / 530000\"
  echo "   the -s 2 means set the scale to 2 decimal places"
  echo or
  echo calcit \"13000 / 530000\"
  return 1
fi

# if the first argument is "-s" then grab the next arg as the desired scale 
# i.e.: places past the decimal point
if [ "x$1" = "x-s" ]; then
  SCALE=$2
  shift 2
fi
SCALE=${SCALE:-1}

# now pump the scale and the rest of the argument(s) (which we assume is the calculation)
# into bc (-q for quiet [no welcome banner] and -l for the math library)
# a perl alternative: perl -le "print 13000/530000"
bc -ql <<EOF
scale=$SCALE
$*
EOF
}

I can easily run a quick calculation from the command line with something like this:
calcit "70*70/105"
46.6
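And with the scale option:
calcit -s 4 "13000 / 530000"
.0245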
Enjoy
-g-

Quick-n-Dirty subversion (svn) using ssh tunnel


apt-get install subversion subversion-tools apache2 libapache2-svn
# Create a new repository
mkdir /data/svnrepos
svnadmin create /data/svnrepos
groupadd svn
#(add the desired users to this group
# - remember they need to log out and back in - even X sessions -
#   for the new group membership to take effect)
# then:
chmod -R g+rw /data/svnrepos
chgrp -R svn /data/svnrepos
# the repository is now ready

#### to import a project
# cd to your dev dir, e.g.:
cd $HOME/dev
#then
mkdir myproj
cd myproj
mkdir trunk branches tags # trust me add all 3 of these
#edit some files in the trunk dir
cd trunk # $HOME/dev/myproj/trunk
echo "lots of stuff">new.fil
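
From here the usual next steps are to import the project and check out a working copy over ssh – a sketch, assuming the repository path above and ssh access to the server (replace "svnhost" with your host name):

cd $HOME/dev
svn import myproj file:///data/svnrepos/myproj -m "initial import"
# then, from any machine with ssh access:
svn checkout svn+ssh://svnhost/data/svnrepos/myproj/trunk myproj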


Solaris 10 (x86) – with a failure to communicate

I had an interesting issue develop today. I was asked to help with a Solaris 10 system that failed to come up after a reboot – or rather, was unreachable remotely after a reboot. The machine answered a ping but ssh failed to respond. Fortunately I was able to string a console cable to a laptop and take a look at what was going on. Listing the services and grepping for ssh
# svcs | grep ssh
showed that ssh had failed to come online. I tried to restart it, without success and with no messages about why.
# svcadm restart ssh
Doing a check of dependencies
# svcs -d ssh
and a detailed check on the service
# svcs -xv ssh
showed that the filesystem/local:default service was failing to come up. Hmmmm – yet a df -k looked ok…

Continue reading