Government Security
Network Security Resources

Need a regex for RFCs


5 replies to this topic

#1 meathive

meathive

    Staff Sergeant

  • Sergeant Major
  • 254 posts

Posted 07 December 2008 - 07:08 PM

I'm looking for a regex wiz who can find a way to match the entries in the RFC blocks here.

I need a pattern for sed/grep/perl, any will do, for...
0001 Host Software. S. Crocker. April 1969. (Format: TXT=21088 bytes)
		(Status: UNKNOWN)

0002 Host software. B. Duvall. April 1969. (Format: TXT=17145 bytes)
		(Status: UNKNOWN)

0003 Documentation conventions. S.D. Crocker. April 1969. (Format:
		TXT=2323 bytes) (Obsoleted by RFC0010) (Status: UNKNOWN)

...where the description after each four-digit number can be extracted. I've tried removing duplicate newlines and other whitespace tricks with tr, but no luck so far. If everything were on the same line it would be a snap, but I'm unsure how to handle this multiline pattern. I couldn't find a pattern that matched every character up to "\n\n", the start of the next entry.

Any suggestions?
...oO oO oO kinqpinz.info Oo Oo Oo...
---------------------------------------------------------
# angelheaded hipsters
## burning for the ancient heavenly connection
### to the starry dynamo
#### in the machinery of night.

#2 dale22

dale22

    Private First Class

  • Members
  • 23 posts

Posted 07 December 2008 - 09:57 PM

Try this algorithm:

Read the first 4 characters and save them as the rfc#
Read until '(Status:'
then read until the next ')'; everything read since the rfc# is your rfc description
skip the next linefeed

Sorry that I don't know much Perl, but the RFC entries all end with "(Status: xxxx)".
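
The steps above could be sketched roughly as follows. This is Python rather than sed/grep/perl, and only a sketch under assumptions: every entry starts at the beginning of a line with a four-digit number, continuation lines are indented, and every entry ends with "(Status: ...)". The names SAMPLE, ENTRY and parse_entries are made up for illustration.

```python
import re

# Assumed sample input, copied from the first post (tab-indented continuations).
SAMPLE = """0001 Host Software. S. Crocker. April 1969. (Format: TXT=21088 bytes)
\t\t(Status: UNKNOWN)

0003 Documentation conventions. S.D. Crocker. April 1969. (Format:
\t\tTXT=2323 bytes) (Obsoleted by RFC0010) (Status: UNKNOWN)
"""

# Four digits at the start of a line, then everything (newlines included)
# up to and including the first "(Status: ...)" group.
ENTRY = re.compile(r"^(\d{4})\s+(.*?\(Status:[^)]*\))", re.MULTILINE | re.DOTALL)


def parse_entries(text):
    """Return (rfc_number, one_line_description) pairs found in text."""
    entries = []
    for number, description in ENTRY.findall(text):
        # Collapse newlines and runs of tabs/spaces into single spaces.
        entries.append((number, " ".join(description.split())))
    return entries


if __name__ == "__main__":
    for number, description in parse_entries(SAMPLE):
        print(number, description)
```

An entry that lacks a "(Status: ...)" part would be merged into the following entry by this pattern, so it is only safe if that assumption holds for the whole file.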

#3 webdevil

webdevil

    Retired GSO General

  • Sergeant Major
  • 1,195 posts

Posted 08 December 2008 - 02:50 AM

I didn't quite understand what you wanted, but since you said

If everything were on the same line it would be a snap

cat filename | sed 's/	 /=/' | sed -e :a -e '$!N;s/\n=/ /;ta' -e 'P;D'

That should bring everything to the same line
;)
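
For anyone who would rather not decode the sed one-liner, the same line-joining idea can be sketched in Python. It is not a translation of the sed command, just the same approach under one assumption: any non-blank line that begins with whitespace continues the line before it. The name join_continuations is hypothetical.

```python
def join_continuations(text):
    """Append every indented, non-blank line to the line before it."""
    joined = []
    for line in text.splitlines():
        stripped = line.strip()
        # A continuation line is non-blank and starts with a tab or space.
        is_continuation = bool(stripped) and line[0].isspace()
        if is_continuation and joined and joined[-1]:
            joined[-1] += " " + stripped
        else:
            # Entry start lines and blank separator lines are kept as they are.
            joined.append(line.rstrip())
    return "\n".join(joined)


if __name__ == "__main__":
    sample = (
        "0001 Host Software. S. Crocker. April 1969. (Format: TXT=21088 bytes)\n"
        "\t\t(Status: UNKNOWN)\n"
        "\n"
        "0002 Host software. B. Duvall. April 1969. (Format: TXT=17145 bytes)\n"
        "\t\t(Status: UNKNOWN)\n"
    )
    print(join_continuations(sample))
```

Blank lines between entries are kept, so each entry ends up on one line with the original spacing between entries.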

#4 meathive

meathive

    Staff Sergeant

  • Sergeant Major
  • 254 posts

Posted 08 December 2008 - 03:24 AM

You're right, I didn't make that clear. Instead of...
0001 Host Software. S. Crocker. April 1969. (Format: TXT=21088 bytes)
		(Status: UNKNOWN)

...I want the entire description on one line (not the Bash code), e.g...
0001 Host Software. S. Crocker. April 1969. (Format: TXT=21088 bytes) (Status: UNKNOWN)


#5 webdevil

webdevil

    Retired GSO General

  • Sergeant Major
  • 1,195 posts

Posted 08 December 2008 - 03:34 AM

My solution should have worked then, didn't it?

#6 meathive

meathive

    Staff Sergeant

  • Sergeant Major
  • 254 posts

Posted 08 December 2008 - 01:53 PM

It did, nicely done. I hadn't tried it until now.

Thanks.



