You are not logged in.

#1 2023-02-20 08:48:58

feinedsquirrel
Member
Registered: 2020-12-11
Posts: 43

PostgreSQL collation warning

I've read through this thread and I am still a little confused at what I exactly need to do.

I have only the en.UTF-8 locale on my machine, and querying the databases ensures all use that locale. I also ran the command in the linked postgres docs, and it returns "(0 rows)":

[postgres@myhost ~]$ psql
WARNING:  database "postgres" has a collation version mismatch
DETAIL:  The database was created using collation version 2.36, but the operating system provides version 2.37.
HINT:  Rebuild all objects in this database that use the default collation and run ALTER DATABASE postgres REFRESH COLLATION VERSION, or build PostgreSQL with the right library version.
psql (15.1)
Type "help" for help.


postgres=# \l
                                                  List of databases
    Name    |  Owner   | Encoding |   Collate   |    Ctype    | ICU Locale | Locale Provider |   Access privileges   
------------+----------+----------+-------------+-------------+------------+-----------------+-----------------------
 myapp_db   | myapp    | UTF8     | en_US.UTF-8 | en_US.UTF-8 |            | libc            | 
 postgres   | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |            | libc            | 
 template0  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |            | libc            | =c/postgres          +
            |          |          |             |             |            |                 | postgres=CTc/postgres
 template1  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |            | libc            | =c/postgres          +
            |          |          |             |             |            |                 | postgres=CTc/postgres
(4 rows)

postgres=# SELECT pg_describe_object(refclassid, refobjid, refobjsubid) AS "Collation",
postgres-# pg_describe_object(classid, objid, objsubid) AS "Object"
postgres-# FROM pg_depend d JOIN pg_collation c
postgres-# ON refclassid = 'pg_collation'::regclass AND refobjid = c.oid
postgres-# WHERE c.collversion <> pg_collation_actual_version(c.oid)
postgres-# ORDER BY 1, 2;
 Collation | Object 
-----------+--------
(0 rows)

One point in particular confuses me from the other thread:

wolegis wrote:

as I had observed that for these collations the version was still 2.36

I've found how to get the collation type (e.g. UTF-8) but not the version number. How could I observe this, besides from the "DETAIL" printed out with the "WARNING"?

Also, user "wolegis" ran

...
alter collation pg_catalog."en_US.utf8" refresh version;
alter collation pg_catalog."en_US" refresh version;

then after he ran

alter database xxxx refresh collation version;

loqs ended with

loqs wrote:

... You would then only need to run ALTER DATABASE.

Primary question:
Since I only have UTF-8, and no un-used collations, does this mean I only need to "alter databse...collation version;" and I do not need to "alter collation...refresh version;"?

From the doc page, I think the answer to the last question is "correct", because it says

https://www.postgresql.org/docs/current/sql-altercollation.html#SQL-ALTERCOLLATION-NOTES wrote:

For the database default collation, there is an analogous command ALTER DATABASE ... REFRESH COLLATION VERSION

and my database default is utf-8. I guess I am looking for confirmation before I go running commands on my database.

Final question:
If the query had returned a non-zero (rows?) then would the correct process simply involve running REINDEX (I've just read the doc page for it) followed by "ALTER DATABASE..."

Thanks in advance

_________________________
edit:

I decided to move forward, to see what I could do, after taking a backup of the database cluster.

The docs page for managing templates says:

https://www.postgresql.org/docs/current/manage-ag-templatedbs.html wrote:

This is because template1 might contain encoding-specific or locale-specific data, while template0 is known not to

I found that after attempting to

postgres=# ALTER DATABASE template0 REFRESH COLLATION VERSION;
ERROR:  invalid collation version change

Does this mean that template0 doesn't have a collation version at all? That same doc page mentions using template0 for pg_dump restorations.
Part of my question is if I restored a pg_dump backup of a database that has collation from glibc 2.37, would issues arise restoring that dump on top of a template0 that was created while glibc 2.36 was in use?
My suspicion is that no issues would arise, but I would just like to be sure, moving forward, as I was able to update the collation version of "template1" and "postgres" successfully.

Last edited by feinedsquirrel (2023-02-20 22:04:05)

Offline

#2 2023-08-04 02:57:19

drankinatty
Member
From: Nacogdoches, Texas
Registered: 2009-04-24
Posts: 91
Website

Re: PostgreSQL collation warning

I ran into this today on postgressql update where the ICU collation version went from 2.37 to 2.38. I found Collation version mismatch - postgresql - DBA Stack Exchange which advised to reindex each 'database' then alter the database and refresh the version number. For example for my 'code' database I needed to do:

    reindex database code;
    alter database code refresh collation version;

That solved the issue. I then had to repeat for each database. A better option may be to loop over each database from your shell and do the same thing to all databases (excluding those that don't require update) to accomplish the update for all at once. You could query to get the list of all databases with something similar to

$ psql -d <anydb> -tAc 'SELECT datname FROM pg_database WHERE datallowconn = true';

Where you replace 'anydb' with any database to initially connect to (you can omit it if you have a database named the same as your Linux username)

Last edited by drankinatty (2023-08-04 04:09:29)


David C. Rankin, J.D.,P.E.

Offline

#3 2023-08-04 19:28:52

knotty66
Member
Registered: 2013-10-22
Posts: 1

Re: PostgreSQL collation warning

Thanks! Came in really handy, after GPT-4 told me some massive laborious process I needed to follow to upgrade...

I used this script, inspired by your comment, running it as the postgres user.

#!/bin/bash

# Get a list of all databases
DB_LIST=$(psql -Atc "SELECT datname FROM pg_database where datallowconn = true;")

for db in $DB_LIST; do
    echo "Processing $db"
    
    # Reindex the database
    psql -c "REINDEX DATABASE $db;" $db
    
    # Refresh the collation version
    psql -c "ALTER DATABASE $db REFRESH COLLATION VERSION;" $db
    
    echo "$db done."
done

Offline

#4 2023-08-09 10:18:36

nl6720
The Evil Wiki Admin
Registered: 2016-07-02
Posts: 710

Re: PostgreSQL collation warning

If you have a working solution, it would be useful to add it to https://wiki.archlinux.org/title/Postgr … leshooting.

Offline

#5 2023-08-30 09:29:15

amish
Member
Registered: 2014-05-10
Posts: 507

Re: PostgreSQL collation warning

knotty66 wrote:

I used this script, inspired by your comment, running it as the postgres user.

...
    # Reindex the database
    psql -c "REINDEX DATABASE $db;" $db
    
...

I think for safety we should also add:

...
    # Reindex the system catalogs
    psql -c "REINDEX SYSTEM $db;" $db
...

Because from the documentation, it is not clear if REINDEX DATABASE, re-indexes SYSTEM catalogs as well or not.

So above will make sure that everything in DATABASE as well as SYSTEM catalogs is reindexed with new collation.

Last edited by amish (2023-08-30 09:32:57)

Offline

#6 2023-08-30 10:00:16

amish
Member
Registered: 2014-05-10
Posts: 507

Re: PostgreSQL collation warning

I have squeezed down @knotty66's script, to use single SQL connection per database instead of three.

#!/bin/bash
DB_LIST=$(psql -Atc "SELECT datname FROM pg_database where datallowconn = true;")
for db in $DB_LIST; do
    echo "Reindexing $db"
    /usr/bin/psql -q -f - $db << _EOF
REINDEX DATABASE $db;
REINDEX SYSTEM $db;
ALTER DATABASE $db REFRESH COLLATION VERSION;
_EOF
done

Script should be run as 'postgres' user.

# sudo -u postgres /path/to/script

Last edited by amish (2023-08-30 10:04:01)

Offline

#7 2024-02-24 02:18:02

jkhsjdhjs
Member
Registered: 2017-09-05
Posts: 39

Re: PostgreSQL collation warning

And for the third revision of the script big_smile, I added quotes around the $db variable so the script also works with databases that have hyphens in their name:

#!/usr/bin/env bash
DB_LIST=$(psql -Atc "SELECT datname FROM pg_database WHERE datallowconn = true;")
for db in $DB_LIST; do
    echo "Reindexing $db"
    /usr/bin/psql -q -f - "$db" << _EOF
REINDEX DATABASE "$db";
REINDEX SYSTEM "$db";
ALTER DATABASE "$db" REFRESH COLLATION VERSION;
_EOF
done

Offline

#8 2024-02-24 10:06:37

amish
Member
Registered: 2014-05-10
Posts: 507

Re: PostgreSQL collation warning

Which command failed without quotes? (where db had hyphen in its name)

Offline

#9 2024-02-24 11:12:07

jkhsjdhjs
Member
Registered: 2017-09-05
Posts: 39

Re: PostgreSQL collation warning

Both REINDEX commands and the ALTER command. The psql command itself doesn't fail without quotes and quoting $db isn't really necessary there, as it is preceeded by a hyphen, but I still quoted it because of a shellcheck warning. It's cleaner than adding a comment that makes shellcheck ignore the warning.

Offline

#10 2024-02-24 14:02:47

amish
Member
Registered: 2014-05-10
Posts: 507

Re: PostgreSQL collation warning

Do you mean hyphen or space? Does postgres treat hyphen in database name as something special? And thus throw error?

UPDATE: Googled it. I was not knowing that hyphen treated as special character in DB names. Lucky I never use hyphens in name.

Last edited by amish (2024-02-24 14:06:56)

Offline

Board footer

Powered by FluxBB