#1 2009-07-25 05:18:32

ngoonee
Forum Fellow
From: Between Thailand and Singapore
Registered: 2009-03-17
Posts: 7,356

[SOLVED] Quick sed/awk question, concerning splitting birthdays

Hi all, I'm using J-Pilot and Evolution, and am writing a simple script to convert J-Pilot's CSV format to Evo's native CSV format for easy import, in the absence of an actual sync between the two.

Anyway, I've used a pretty basic sed->awk->sed pipeline to get nearly everything correct, except for one small matter: J-Pilot saves birthdates as <year>/<month>/<day>, whereas Evo stores the year, month, and day as separate fields. My current script has got to the point where I have:-

<some data, name, address, whatever>*1980/12/19*<rest of the data>

And I'd just like to replace the / in there with *, which would make it work properly. My issue is that I can't use a simple sed to replace every /, since the addresses use the same character for road numbers. I was wondering whether I could run sed on that section only. My awk, for reference, is one big { print $18"*"$4"*".... } that maps input fields to output fields.
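Since the mapping already happens in awk, one option is to rewrite only the birthday field there with gsub() before printing. The following is a minimal sketch, not the script from this thread: the sample record and the field number 3 are made up for illustration, and only the * separator comes from the sample line above.

```shell
# Hypothetical record: field 3 holds the birthday. The real field number
# in the J-Pilot export will differ.
line='John Doe*12/3 Some Road*1980/12/19*rest'

# Split on a literal "*" and rejoin with "*". gsub() rewrites the slashes
# in field 3 only, so the "/" in the address field is left alone.
result=$(printf '%s\n' "$line" |
  awk 'BEGIN { FS = "[*]"; OFS = "*" } { gsub("/", "*", $3); print }')

printf '%s\n' "$result"
```

Because the substitution is applied to one field, there is no need for a regex that tells dates and addresses apart.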

Thanks for your help. Please do ask if I'm not clear.

Edit: Solved, thanks arkham.

Last edited by ngoonee (2009-07-25 06:01:43)


Allan-Volunteer on the (topic being discussed) mailing lists. You never get the people who matters attention on the forums.
jasonwryan-Installing Arch is a measure of your literacy. Maintaining Arch is a measure of your diligence. Contributing to Arch is a measure of your competence.
Griemak-Bleeding edge, not bleeding flat. Edge denotes falls will occur from time to time. Bring your own parachute.


#2 2009-07-25 05:39:33

arkham
Member
From: Stockholm
Registered: 2008-10-26
Posts: 516

Re: [SOLVED] Quick sed/awk question, concerning splitting birthdays

Does this work?

sed -e 's|^\(<[^>]*>\*[0-9]\{4\}\)/\([0-9]\{2\}\)/\([0-9]\{2\}\*<[^>]*>\)$|\1*\2*\3|'

A small explanation:
- '|' is used as the separator instead of '/' (e.g. sed -e 's|foo|bar|'), so the slashes in the pattern need no escaping;
- The first ^ anchors the match to the start of the line, and the last $ to the end of the line (e.g. sed -e 's/^foo$/spam/');
- <[^>]*> matches any sequence that starts with '<', followed by zero or more characters other than '>', then '>';
- [0-9]\{4\} matches exactly four consecutive digits;
- The \( \) construct delimits a group that can be referred to as \1, \2, ... in the replacement: try sed -e 's/\(.*\) \(.*\) \(.*\)/\1 +++ \2 +++ \3/' <<< "egg sausage spam" to better understand groups.
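The group example from the list above can be run as follows. This is just a sketch that feeds the sample string through a pipe instead of a here-string, so it also works in shells without <<<; the variable name is arbitrary.

```shell
# Three groups separated by single spaces; each \(.*\) captures one word
# because the input contains exactly two spaces.
groups=$(printf '%s\n' 'egg sausage spam' |
  sed -e 's/\(.*\) \(.*\) \(.*\)/\1 +++ \2 +++ \3/')

printf '%s\n' "$groups"   # egg +++ sausage +++ spam
```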

EDIT: a shorter and more elegant version would be

sed -e '/>/,/</s|/|*|g'

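(the idea is to start substituting when you meet '>' and stop when you meet '<'; note, however, that sed address ranges select whole lines, so this applies the substitution to every line from one containing '>' through the next one containing '<', not just to the text between those two characters)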
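To see what the range form actually does, here is a small sketch on made-up input. It shows that a /start/,/end/ address pair picks out a block of lines and that the substitution then runs on the whole of each selected line.

```shell
# Made-up five-line input: only lines 2 to 4 fall inside the />/,/</ range.
input='a/b
x>y/z
p/q
m<n/o
r/s'

ranged=$(printf '%s\n' "$input" | sed -e '/>/,/</s|/|*|g')

printf '%s\n' "$ranged"
```

Lines 1 and 5 keep their slashes, while every slash on lines 2 to 4 is replaced, including those that come before '>' or after '<' on the same line. For a one-record-per-line CSV file this means the range cannot confine the change to a single field.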

Last edited by arkham (2009-07-25 06:03:34)


"I'm Winston Wolfe. I solve problems."

~ Need moar games? [arch-games] ~ [aurcheck] AUR haz updates? ~


#3 2009-07-25 05:59:49

ngoonee
Forum Fellow
From: Between Thailand and Singapore
Registered: 2009-03-17
Posts: 7,356

Re: [SOLVED] Quick sed/awk question, concerning splitting birthdays

Thanks arkham :). I was actually working on that as well and came up with this:-

sed 's/\([0-9][0-9][0-9][0-9]\)\/\([0-9][0-9]\)\/\([0-9][0-9]\)/\1\*\2\*\3/g'

Mine seems simpler to understand (obviously, since I managed to write it after all). I see you included <[^>]*>; that's my mistake, as the <some data> actually looks like name*address*blah*blah, and I thought putting it in <> would just indicate that it's not REAL text. My apologies, I've learnt something anyhow.

I do like the \{4\}, however; I think I'll be using that instead. Thanks much.
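For reference, a sketch of the same substitution written with \{4\} and \{2\}, and with '|' as the delimiter so the slashes need no escaping. The record below is invented for illustration and is not real J-Pilot output.

```shell
# Hypothetical record with a slash in the address and a birthday field.
record='Jane Roe*12/3 Jalan Example*1980/12/19*more data'

# Only a 4-digit/2-digit/2-digit run is rewritten, so "12/3" is untouched.
converted=$(printf '%s\n' "$record" |
  sed 's|\([0-9]\{4\}\)/\([0-9]\{2\}\)/\([0-9]\{2\}\)|\1*\2*\3|g')

printf '%s\n' "$converted"
```

This assumes no address contains something shaped like a date (four digits, slash, two digits, slash, two digits); if one did, it would be rewritten as well.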



#4 2009-07-25 06:13:03

arkham
Member
From: Stockholm
Registered: 2008-10-26
Posts: 516

Re: [SOLVED] Quick sed/awk question, concerning splitting birthdays

LOL, never use metacharacters when asking for regexps! :D



#5 2009-07-25 06:15:08

ngoonee
Forum Fellow
From: Between Thailand and Singapore
Registered: 2009-03-17
Posts: 7,356

Re: [SOLVED] Quick sed/awk question, concerning splitting birthdays

Yes, now I realize :). Thanks again.

