Solved ezjail-admin Error: Could not start jail1. You need to start it by hand.

Hello,

I have a problem starting a jail after stopping it:

Code:
[16:20]root@srv# ezjail-admin list
STA JID  IP              Hostname                       Root Directory
--- ---- --------------- ------------------------------ ------------------------
ZR  1    XXX.187.YYY.39  jail1              /usr/jails/jail1
ZR  2    XXX.187.YYY.40  jail2              /usr/jails/jail2
ZR  3    XXX.187.YYY.37  jail3              /usr/jails/jail3
ZR  4    XXX.187.YYY.36  jail4              /usr/jails/jail4
ZR  5    XXX.187.YYY.38  jail5              /usr/jails/jail5

[16:20]root@srv# ezjail-admin stop jail1
Stopping jails: jail1.

[16:21]root@srv# ezjail-admin start jail1
Configuring jails:.
Starting jails: cannot start jail "jail1":
.
Error: Could not start jail1.
  You need to start it by hand.

[16:22]root@srv# jls
   JID  IP Address      Hostname                      Path
  1    XXX.187.YYY.39  jail1              /usr/jails/jail1
  2    XXX.187.YYY.40  jail2              /usr/jails/jail2
  3    XXX.187.YYY.37  jail3              /usr/jails/jail3
  4    XXX.187.YYY.36  jail4              /usr/jails/jail4
  5    XXX.187.YYY.38  jail5              /usr/jails/jail5

[16:24]root@srv# ezjail-admin list
STA JID  IP              Hostname                       Root Directory
--- ---- --------------- ------------------------------ ------------------------
ZS  N/A  XXX.187.YYY.39  jail1             /usr/jails/jail1
ZR  2    XXX.187.YYY.40  jail2              /usr/jails/jail2
ZR  3    XXX.187.YYY.37  jail3              /usr/jails/jail3
ZR  4    XXX.187.YYY.36  jail4              /usr/jails/jail4
ZR  5    XXX.187.YYY.38  jail5              /usr/jails/jail5

Why is it still visible in the jls list?

How can I fix it?
 
Apparently (part of) it is still running and it never stopped completely. A stopped jail should not show up with jls(1).
How can I stop it?

I checked /var/run for jail1.id and ran "sockstat -4 -l | grep XXX.187.XXX.39" to look for a leftover process, but found nothing...
 
What happens if you do jexec 1 /usr/bin/login -f root?
Code:
[17:16]root@srv# jexec 1 /usr/bin/login -f root
jexec: execvp(): /usr/bin/login: No such file or directory


I found one process in the stopped jail1:
Code:
[17:09]root@srv# ps -o pid,jid -awux | grep -w 1
  PID JID USER      %CPU %MEM    VSZ    RSS TT  STAT STARTED          TIME COMMAND
...
84014   1 mailnull   0.0  0.0      0     16 ??  DEJ   4:04PM       0:00.01 /usr/local/sbin/exim -bd -q30m
...
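
A side note on the filter used here: grep -w 1 matches any line that contains 1 as a separate word, so the row for PID 1 would show up as well. A minimal sketch of a stricter filter, assuming the column layout shown above (PID first, JID second); filter_jid is a made-up helper name, not part of ezjail or the base system:

```shell
# Print the header line plus every row whose JID column equals the given jail ID.
# Assumes the layout of `ps -o pid,jid -awux`: PID in column 1, JID in column 2.
# filter_jid is a hypothetical helper name used only for this illustration.
filter_jid() {
    awk -v jid="$1" 'NR == 1 || $2 == jid'
}

# Intended use on the host (not run here):
#   ps -o pid,jid -awux | filter_jid 1
```

Comparing the JID column directly avoids accidental matches on PIDs or other fields that happen to be 1.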

How can I kill it?

Commands like
Code:
kill -9 84014
killall -j 1
don't work.
 
Looking at the state ("DEJ"), the process seems to be trying to exit (E) but is waiting for disk (D). Try not to use kill -9: it kills a process hard and may leave data files in a corrupted state, because the process doesn't get the opportunity to exit gracefully. In this case I don't think it matters much, but applications like MySQL server should never be killed that way. I'd give the process a bit of time to exit gracefully; it may need a little more time for housekeeping.
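
To make the state string easier to read, here is a small sketch that spells out the letters one per line. It only covers the three flags that appear in this thread, with meanings as I understand them from ps(1); decode_stat is a made-up helper name, and some letters mean different things depending on their position in the string, so treat anything beyond these three as something to look up.

```shell
# Explain the letters of a ps STAT string, one per line.
# Only D, E and J (the flags seen in this thread) are spelled out;
# every other character is deferred to ps(1).
# decode_stat is a hypothetical helper name used only for this illustration.
decode_stat() {
    printf '%s\n' "$1" | fold -w 1 | while IFS= read -r flag; do
        case "$flag" in
            D) echo "D: uninterruptible (disk) wait" ;;
            E) echo "E: process is trying to exit" ;;
            J) echo "J: process is in a jail" ;;
            *) echo "$flag: see ps(1)" ;;
        esac
    done
}

decode_stat DEJ
```

For the exim process above, this prints the three lines for D, E and J, which matches the reading that it is exiting inside the jail but stuck in a disk wait.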

For now you could take a peek at what it's trying to do using truss -p <pid>. Hopefully that will give some clues why it's not stopping.
 
Code:
[20:09]root@srv#truss -p 84014
truss: can not attach to target process: No such process
But the process is still listed in ps aux.
 