Perhaps the problem in its entirety is a bit more involved than just regular expressions, but for a start, this small problem is what I need help with.
In short, I have some data which can be in the form of some text, an IP address or a URL.
If it is a URL, it will usually, if not always, be in the form of a base URL, with or without one or more subdomains.
I need to strip away these subdomains so that only the base URL remains, thus:
abc.def.ghi.com
becomes
ghi.com
Forgetting for the moment that some top-level domains have two parts, it seems so easy, yet I cannot for the life of me figure out how to do it using only the basic command-line tools available, such as sed, awk, and so on, and thus I hope that someone here can and will help me. It is of course also possible that the process is simply too involved for a simple command line, in which case I suppose a bash or Python script will have to do, but I really hope not. In any case, I will appreciate any help and/or advice you can give me.
Best regards.
Last edited by zacariaz (2015-05-07 13:38:47)
I am a philosopher, of sorts, not a troll or an imbecile.
My apologies that this is not always obvious, despite my best efforts.
Is this a homework exercise?
What have you tried so far? It should be easy enough to accomplish using a combination of rev and cut, although I'm sure there are more elegant solutions using awk.
Mod note: Moving to Newbie Corner
Sakura:-
Mobo: MSI MAG X570S TORPEDO MAX // Processor: AMD Ryzen 9 5950X @4.9GHz // GFX: AMD Radeon RX 5700 XT // RAM: 32GB (4x 8GB) Corsair DDR4 (@ 3000MHz) // Storage: 1x 3TB HDD, 6x 1TB SSD, 2x 120GB SSD, 1x 275GB M2 SSD
Making lemonade from lemons since 2015.
I wish it were homework, but no.
The rev idea is interesting, and I'll look into it, but as it is, an awk solution would be preferable, as this is only part of a larger issue, and I'd like to keep it as simple as possible. Also, as for top-level domains with two parts, it's still somewhat involved.
As for what I've tried: a lot, as I only stopped and went to bed when I realized the sun was rising. The main problem is that I don't really know how to attack the problem, and of course I'm not exactly a master of regular expressions.
Anyway thanks.
You don't need regular expressions -- just split on dots.
awk -F. '{OFS=".";print $(NF-1), $NF}'
or
awk -F. '{print $(NF-1) "." $NF}'
or
rev | cut -d. -f-2 | rev
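Since the original question says the data may also be plain text or an IP address, here is a rough sketch of how the field-splitting idea could be wrapped so those lines are left alone. The function name base_domain is made up for illustration, and the IPv4 check is only a simple pattern match, not a validation. Text that happens to contain two or more dots would still be trimmed, so treat this as a starting point rather than a finished filter.

```shell
# base_domain: hypothetical helper name, not from the thread.
# Reads one entry per line on stdin and prints the last two
# dot-separated labels of hostnames; IPv4-looking lines and lines
# with fewer than two dots pass through unchanged.
base_domain() {
    awk -F. '
        # Four all-numeric fields: treat as an IPv4 address, leave it alone.
        NF == 4 && $0 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ { print; next }
        # At least one subdomain present: keep only the last two labels.
        NF > 2 { print $(NF-1) "." $NF; next }
        # Anything else (plain text, bare domain) is printed as is.
        { print }
    '
}

printf '%s\n' abc.def.ghi.com ghi.com 192.168.1.10 localhost | base_domain
```

With the sample input, abc.def.ghi.com is reduced to ghi.com while the other three lines come out unchanged.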
Last edited by Raynman (2015-05-07 12:17:29)
Yes, thanks. I figured that out as well, but only got around to mentioning it now, sorry.
I have also concluded that, for now, the second-level domain issue is too big a mouthful.
I still have big issues with awk, but nothing I can't handle, I think, so thanks.
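For anyone who later wants to tackle the two-part top-level domain case set aside here, one possible approach is to keep three labels instead of two whenever the last two labels match a known suffix. The sketch below assumes a tiny hard-coded suffix list (co.uk, org.uk, com.au, co.jp) chosen purely for illustration; a real solution would need a much fuller list, such as the Public Suffix List. The function name base_domain2 is likewise hypothetical.

```shell
# base_domain2: hypothetical helper name; the suffix list is an
# illustrative assumption and is far from complete.
base_domain2() {
    awk -F. '
        BEGIN {
            two["co.uk"] = 1; two["org.uk"] = 1
            two["com.au"] = 1; two["co.jp"] = 1
        }
        # No dot at all, or an IPv4-looking line: print unchanged.
        NF < 2 || $0 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ { print; next }
        {
            suffix = $(NF-1) "." $NF
            # Keep three labels for a known two-part suffix, else two.
            keep = (suffix in two) ? 3 : 2
            if (NF <= keep) { print; next }
            out = $NF
            for (i = NF - 1; i > NF - keep; i--) out = $i "." out
            print out
        }
    '
}

printf '%s\n' www.bbc.co.uk abc.def.ghi.com | base_domain2
```

Here www.bbc.co.uk is reduced to bbc.co.uk, and abc.def.ghi.com to ghi.com as before.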