CI work in progress

Today I made progress on running CI in GitLab. Here’s the branch:

The current situation is that tests run as root in the docker container, but
pg_ctl says “cannot be run as root”.

In other words, the scripts assume the user is unprivileged, but the docker
image (and the GitLab runner) assume the user is root.
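
The mismatch can be caught before any database command runs. A minimal sketch of such a preflight check, assuming a Unix-like container; the function name `must_drop_privileges` is hypothetical and not part of the Snowdrift scripts:

```python
import os


def must_drop_privileges(euid=None):
    """Return True when running as root, which pg_ctl refuses to run as."""
    # Allow the effective user id to be passed in so the check is easy to test.
    euid = os.geteuid() if euid is None else euid
    return euid == 0
```

A script could call this first and fail early with a clear message, rather than failing later inside pg_ctl.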

Attempt 1:

I tried to just set up the database with ‘sudo -u postgres’, but (a) the
postgres user is arbitrary here, and perhaps misleading, and (b) the database
still can’t be accessed as root. The connection error is “No such role”.

I would need to invasively modify some scripts to work around (b), and I think
it might be better to just set up an actual database instead. But the scripts
are dead set on using a directory-local database cluster, so that would need to
change anyway.

Attempt 2:

I tried to use a USER command in the Dockerfile to switch to an unprivileged
user. That seems like a promising avenue, but by default the GitLab runner is
trying to put files in privileged places (/build) and they aren’t accessible by
unprivileged users. At least, I think that’s what is happening. See the error
message in the test-build job (#340419246) on GitLab.

Attempt 3:

Haven’t done this yet, but perhaps the best idea is to just set up an actual
database (we could use GitLab CI services, even). This is
probably already pretty well supported - the CI scripts just need to stop
setting up the local-directory cluster and use the PG environment variables.
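
To illustrate what "use the PG environment variables" could look like, here is a rough sketch that reads the standard libpq variables (PGHOST, PGPORT, PGUSER, PGDATABASE) and falls back to defaults. The default values and the helper names `pg_settings` and `conninfo` are assumptions for illustration, not anything taken from the existing scripts:

```python
import os

# Standard libpq environment variables. The fallback values are illustrative
# assumptions, not values used by the Snowdrift scripts.
PG_DEFAULTS = {
    "PGHOST": "localhost",
    "PGPORT": "5432",
    "PGUSER": "postgres",
    "PGDATABASE": "postgres",
}

# Mapping from environment variable names to libpq connection keywords.
PG_KEYWORDS = {
    "PGHOST": "host",
    "PGPORT": "port",
    "PGUSER": "user",
    "PGDATABASE": "dbname",
}


def pg_settings(env=None):
    """Return connection settings, preferring values from the environment."""
    env = os.environ if env is None else env
    return {name: env.get(name) or default for name, default in PG_DEFAULTS.items()}


def conninfo(settings):
    """Build a keyword/value connection string (values with spaces would need quoting)."""
    return " ".join(f"{PG_KEYWORDS[name]}={value}" for name, value in settings.items())
```

With something along these lines, the tests would not care whether the database is a directory-local cluster or a CI service; the CI job would only need to export the variables.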

Feel free to pick up this task - I won’t have time to look at it again for a
while. :slight_smile:

4 Appreciations

Is this something where any input and/or help from OSUOSL side could be applicable? I know for our main site, they will have normal ways to manage postgresql databases. Is it worth checking with them or asking them to at least look at this to see if they have any input?

Good to see updates on CI coming back online.

Regarding attempt 1, did you try a variation like su - postgres -c "some_command"? That worked for me in the Ghost Dockerfile to drop down to an unprivileged user.

Hope you will figure it out soon!

2 Appreciations

Not really. We already have all the infrastructure, thanks to GitLab. What’s missing is having code (in our repository) that makes fewer assumptions about where the database is during tests.

It looks like I have some tests working now, but the crowdmatch package is doing something totally different. I vaguely remember thinking it would be good to unify the two packages eventually. Maybe “eventually” is “now”.

1 Appreciation

That might work eventually, but it had most of the same problems as using sudo. It would require some code specialization. The nice thing about Attempt 3 (still in progress) is that it requires code generalization.

1 Appreciation

Indeed, but luckily you were almost done! See for the rest.

I’m not 100% sure what your ideas are wrt the deploy job etc, so you’ll have to do that yourself :slight_smile:


@wolftune you could then try to deploy the SnowdriftReboot.keter artifact generated by the deploy-build stage of the pipeline, once it finishes:

Use the same method as before.

@photm, I see one thing that I think should be changed right away: the “deps” step of the pipeline currently only runs if stack.yaml changes. That might be too optimistic: for instance, it hasn’t fired yet! I think it’s probably best to get rid of the restriction and run it every time. It will make the pipeline take longer, but so be it.

Or maybe this!: Keep the current deps task, but add a sibling task that does the exact same thing, but is manual, rather than restricted to only run when stack.yaml changes. That way we get it to run automatically when we want it to run automatically, and we can do it manually when necessary (like right now).

Kinda yuck, but doing the best we can with what we’re given I’d say.
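
The intended behaviour of the two sibling jobs can be modelled in a few lines. This is only a sketch of the decision logic, not the actual pipeline configuration; the function name `should_run_deps` and its parameters are hypothetical:

```python
def should_run_deps(changed_files, manual=False):
    """Model the proposed pair of jobs.

    One job runs automatically when stack.yaml is among the changed files;
    its sibling does the same work but only when triggered by hand.
    """
    return manual or "stack.yaml" in changed_files
```

For example, a commit that only touches source files would skip deps unless someone triggers the manual job.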

1 Appreciation

Maybe that’s because it has already run at an ancestor commit, namely one of the commits over at my fork?

1 Appreciation

But it looks like the cache doesn’t exist:

Checking cache for default…
FATAL: file does not exist

1 Appreciation

True indeed. Playing around with it some more, I’m very confused as to when it runs :slight_smile:. Sent in (implementing exactly what you proposed).

1 Appreciation

@chreekat did major work today and CI/CD is basically all working! Some updates will happen when we move to OSUOSL, but we’re functional for now!

2 Appreciations

Woo, push-button deploy is back now, too!

Two MRs with some docs on the ops repo:

4 Appreciations