воскресенье, 18 сентября 2011 г.

Extracting domain name from uri with SQL

I don't know if it is effective, but this ugly SQL (in PostgreSQL dialect) extracts domain name part from uri (i.e. translates protocol://something/page into "something"):

SELECT uri, substring (uri from position('//' in uri)+2 for ( char_length(uri) - char_length(substring (uri from position ('/' in substring ( uri from position('//' in uri)+2 )) ) ))) from proxy_data

It seems, it would look better with regexp:
 
select uri, substring ( substring (uri from '//[[:alnum:]\._\-]+') from 3) from proxy_data

However, in later case we should list all possible symbols in domain name in our regexp.

Комментариев нет:

Отправить комментарий