
#1 2015-05-07 11:18:33

zacariaz
Member
From: Denmark
Registered: 2012-01-18
Posts: 539

[solved]Issue concerning regular expressions

Perhaps the problem in its entirety is a bit more involved than just regular expressions, but for a start, this small problem is what I need help with.

In short, I have some data which can be in the form of some text, an IP address or a URL.
If it is a URL, it will usually, if not always, be in the form of a base URL, with or without one or more subdomains.
I need to strip away these subdomains so that only the base URL remains, thus:

abc.def.ghi.com
becomes
ghi.com

Forgetting for the moment that some top-level domains have two parts, it seems so easy, yet I cannot for the life of me figure out how to do it using only the basic command-line tools available, such as sed and awk. I hope that someone here can and will help me. It is of course also possible that the process is simply too involved for a simple command line, in which case I suppose a bash or Python script will have to do, but I really hope not. In any case, I will appreciate any help and/or advice you can give me.


Best regards.

Last edited by zacariaz (2015-05-07 13:38:47)


I am a philosopher, of sorts, not a troll or an imbecile.
My apologies that this is not always obvious, despite my best efforts.


#2 2015-05-07 11:47:20

WorMzy
Forum Moderator
From: Scotland
Registered: 2010-06-16
Posts: 11,896

Re: [solved]Issue concerning regular expressions

Is this a homework exercise?

What have you tried so far? It should be easy enough to accomplish using a combination of rev and cut, although I'm sure there are more elegant solutions using awk.


Mod note: Moving to Newbie Corner



#3 2015-05-07 12:08:56

zacariaz
Member
From: Denmark
Registered: 2012-01-18
Posts: 539

Re: [solved]Issue concerning regular expressions

WorMzy wrote:

Is this a homework exercise?

What have you tried so far? It should be easy enough to accomplish using a combination of rev and cut, although I'm sure there are more elegant solutions using awk.


Mod note: Moving to Newbie Corner

I wish it were homework, but no.

The rev idea is interesting, and I'll look into it, but an awk solution would be preferable, as this is only part of a larger issue and I'd like to keep it as simple as possible. Also, handling top-level domains with two parts is still somewhat involved.

As for what I've tried: a lot. I only stopped and went to bed when I realized the sun was rising. The main problem is that I don't really know how to attack the problem, and of course I'm not exactly a master of regular expressions.


Anyway thanks.



#4 2015-05-07 12:14:41

Raynman
Member
Registered: 2011-10-22
Posts: 1,539

Re: [solved]Issue concerning regular expressions

You don't need regular expressions -- just split on dots.

awk -F. '{OFS=".";print $(NF-1), $NF}'

or

awk -F. '{print $(NF-1) "." $NF}'

or

rev | cut -d. -f-2 | rev
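
As a rough sketch of how the awk variant could cope with the other kinds of input mentioned in the first post (plain text and IP addresses), the snippet below wraps it in a shell function. The name base_domain is made up for illustration, and the sketch assumes each line holds a single bare host name, IPv4 address or dot-free word; text that itself contains several dots, or URLs with a scheme or path, would need extra handling.

```shell
# Hypothetical helper: reduce host names on stdin to their last two labels.
base_domain() {
    awk -F. '
        # Leave dotted-quad IPv4 addresses untouched.
        /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ { print; next }
        # Fewer than three fields: nothing to strip.
        NF < 3 { print; next }
        # Otherwise keep only the last two dot-separated fields.
        { print $(NF-1) "." $NF }
    '
}

printf '%s\n' abc.def.ghi.com ghi.com 192.168.1.10 localhost | base_domain
```

Without the NF check, a line with no dot at all has NF equal to 1, so $(NF-1) is $0 and the line would be printed twice with a dot in between.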

Last edited by Raynman (2015-05-07 12:17:29)


#5 2015-05-07 13:38:30

zacariaz
Member
From: Denmark
Registered: 2012-01-18
Posts: 539

Re: [solved]Issue concerning regular expressions

Raynman wrote:

You don't need regular expressions -- just split on dots.

awk -F. '{OFS=".";print $(NF-1), $NF}'

or

awk -F. '{print $(NF-1) "." $NF}'

or

rev | cut -d. -f-2 | rev

Yes, thanks. I figured that out as well, but only got around to mentioning it now, sorry.

I have also concluded that, for now, the second-level domain issue is too big a mouthful.

I still have big issues with awk, but nothing I can't handle I think, so thanks.
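
For reference, one possible way to approach the two-part case is sketched below. It is only an illustration: the function name base_domain_2ld and the short suffix list are assumptions, not something settled in this thread, and a complete solution would need a full list such as the Public Suffix List. IP addresses are not treated specially here.

```shell
# Hypothetical helper: like the last-two-fields approach, but keeps three
# fields when the host ends in a known two-part suffix.
base_domain_2ld() {
    awk -F. '
        BEGIN {
            # Illustrative suffix list only; extend it as needed.
            n = split("co.uk org.uk com.au co.jp", s, " ")
            for (i = 1; i <= n; i++) two_part[s[i]] = 1
        }
        # Known two-part suffix: keep the last three fields.
        NF >= 3 && (($(NF-1) "." $NF) in two_part) {
            print $(NF-2) "." $(NF-1) "." $NF
            next
        }
        # Ordinary case: keep the last two fields.
        NF >= 2 { print $(NF-1) "." $NF; next }
        # No dot at all: pass the line through unchanged.
        { print }
    '
}

printf '%s\n' www.example.co.uk abc.def.ghi.com localhost | base_domain_2ld
```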


