How to see lines that exceed the vi buffer

I ran into this problem again today. While analyzing a huge file, I needed to see the contents of lines that were well past the vi buffer. Yes, I could have tweaked the vi settings to get a bigger buffer, but I was in a hurry.
Some earlier analysis with egrep, sort, and uniq (see the details below) showed that the line I needed to see was 2489847, but vi was not getting past line 100000.
I fell back on an old trick to see the desired line (2489847) and the lines around it.

 head -n 2489850 default_export.xml | tail -n 10

This listed lines 2489841 through 2489850, which include line 2489847, and allowed me to see what I needed.
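
The same kind of window can also be printed with sed, which takes the first and last line numbers directly. The following is only a sketch: the function name show_lines and its arguments (file, center line, optional context size) are made up for illustration and are not part of the original commands.

```shell
# show_lines FILE CENTER [CONTEXT]
# Print the lines from CENTER-CONTEXT through CENTER+CONTEXT (default context: 5).
show_lines() {
  file=$1
  center=$2
  context=${3:-5}
  start=$((center - context))
  # Clamp the start so a small CENTER does not produce a zero or negative address.
  if [ "$start" -lt 1 ]; then
    start=1
  fi
  end=$((center + context))
  # -n suppresses automatic printing, "p" prints the range,
  # and "q" stops at the last wanted line instead of reading the rest of the file.
  sed -n "${start},${end}p;${end}q" "$file"
}
```

For example, show_lines default_export.xml 2489847 5 would print lines 2489842 through 2489852. Quitting at the end of the range matters on a file this size, since sed would otherwise keep reading to the end.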

For those who might be interested: I needed to find out how many times each parent id was referenced, and then find out what name the most-referenced parent id belonged to, in a huge xml dump where you might see:


<id>12345</id>
<parent>23456</parent>
....

Here is the first step – get a count of each parent id reference:

egrep -n "<parent>[0-9]*</parent>" default_export.xml \
  | sort -k2,2 | uniq -f 1 -c | sort -n | tail -5

This gave me the following output (column 1 = count, column 2 = one of the line numbers where the parent id occurred, column 3 = parent id):

 684 488774:    <parent>21426</parent>
 848 387747:    <parent>15243</parent>
 935 108754:    <parent>1</parent>
1874 2503542:        <parent>1</parent>
3223 2489855:    <parent>25895</parent>
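
Note that parent id 1 shows up on two rows (935 and 1874). That looks like a side effect of indentation: uniq -f 1 compares everything after the first field, leading spaces included, so the same tag at two indentation depths is counted separately. A possible variation, sketched here on the assumption that your grep supports the -o option (GNU and BSD grep do, but it is not in POSIX), counts on the matched tag alone. The function name count_parents is hypothetical.

```shell
# count_parents FILE [TOP]
# Count references to each parent id, ignoring indentation,
# and show the TOP most frequent ones (default: 5).
count_parents() {
  # -o prints only the matched tag, so leading whitespace and
  # line numbers never reach uniq.
  grep -oE '<parent>[0-9]+</parent>' "$1" \
    | sort | uniq -c | sort -n | tail -n "${2:-5}"
}
```

The trade-off is that this version drops the line-number column, so it answers "how many" but not "where".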

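So now I know that parent id 25895 occurred the greatest number of times, but I still need to see who that id belongs to. That means finding the line where the element with id 25895 is defined. Searching for the full id tag, rather than the bare number, keeps the 3223 parent references out of the results:

 egrep -n "<id>25895</id>" default_export.xml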

Which gives the line number:
2489847

That is why I needed to see the lines around 2489847.
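
The lookup and the "show me the neighbors" step can also be combined. This is a rough sketch that assumes a grep with the -C context option (available in GNU and BSD grep, not required by POSIX); the function name show_id_context is invented for the example.

```shell
# show_id_context FILE ID [CONTEXT]
# Print the line defining <id>ID</id> with CONTEXT lines on each side (default: 5).
show_id_context() {
  # -F treats the pattern as a fixed string, -n prefixes line numbers,
  # -C prints that many lines of context before and after each match.
  grep -F -n -C "${3:-5}" "<id>$2</id>" "$1"
}
```

In the output, the matching line is marked with a colon after its line number and the context lines with a hyphen, so show_id_context default_export.xml 25895 would show the definition in place without going through vi at all.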
