tty/pty problem

Hello, my apologies if this needs to be directed elsewhere.

It seems that for some reason, pty/tty's are not being properly allocated on this server. I initially noticed the problem when our rancid script would run, erroring out with:

Code:
clogin error: Error: telnet failed: The system has no more ptys.  Ask your system administrator to create more.

During investigation, I thought it might be expect itself; however, when attempting to log in via SSH, the system fails to allocate a tty as well. I've scoured the web for a solution, to no avail. I've tried the following suggestions, neither of which solved the problem:

Code:
# /etc/rc.d/devfs restart
# kill -HUP 1

I only noticed this in the last couple of days, so I'm not sure whether this is a new problem with this system or an old one that no one had noticed. Looking at other systems of a similar configuration (but running FreeBSD 8.x), I noticed that there are no /dev/pty* devices and very few /dev/tty* devices.

To sum up:
  • no user accounts log in to the system other than root and the local rancid user.
  • the system cannot allocate a tty once nine logins are already established (this happens whether or not rancid is currently doing its job)
  • when rancid is being run (no other user logged in) if the "number of devices to collect simultaneously" is increased to ten, expect fails to allocate a pty
  • it seems that the tty/pty limit is dead set between 9 and 10, even though there are plenty of pty/tty devices in /dev.

I have tried to recreate some ttys/ptys using mknod (is that even necessary anymore, e.g. via MAKEDEV?). I have not rebooted the system, as this is a production machine (I'm saving that as a last-ditch effort).

Any insight/help would be much appreciated. Thanks in advance.

Code:
# uname -imsv
FreeBSD FreeBSD 7.0-RELEASE #0: Mon Apr 28 17:36:07 CDT 2008     root@:/usr/obj/usr/src/sys/COLONEL  amd64 GENERIC

# uptime
11:39AM  up 448 days,  4:06, 6 users, load averages: 0.26, 0.22, 0.26

# grep tty /usr/src/sys/amd64/conf/COLONEL
device          pty             # Pseudo-ttys (telnet etc)

There are some other long-running processes; however, they are not taking up all the ttys:

Code:
# ps -o tty | wc -l
      48
[...]
# ps -o user,tty,start
USER TTY      STARTED
root ttyv0    -
root ttyv1    -
root ttyv2    -
root ttyv3    -
root ttyv4    -
root ttyv5    -
root ttyv6    -
root ttyv7    -
root consolectl -
root consolectl -
root consolectl -
root consolectl  8Apr11
root consolectl  8Apr11
root consolectl  8Apr11
root consolectl  8Apr11
root consolectl  8Apr11
root consolectl  8Apr11
root consolectl  8Apr11
root consolectl  8Apr11
root consolectl  8Apr11
root ttyMq    27Feb12
root ttyMq    27Feb12
root ttyMq    27Feb12
root ttyMq    27Feb12
root ttyMq    27Feb12
root ttyMq    27Feb12
root ttyMq    27Feb12
root ttyMq    27Feb12
root ttyMq    27Feb12
root ttyMq    27Feb12
root ttyMq    27Feb12
root ttyMq    27Feb12
root ttyMq    27Feb12
root ttyMq    27Feb12
root ttyMq    27Feb12
root ttyO5    Tue04PM
root ttyOc    Tue04PM
root ttyOd    Tue04PM
root ttyOk    Thu08AM
root ttyOk    Thu08AM
root ttyOl    11:49AM
root ttyOl    11:49AM
root ttyOm    11:36AM
root ttyOm    12:40PM
root ttyOp     2:19PM
root ttyOu    12:20PM
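
For anyone who wants to condense a listing like the one above, here is a minimal sh/awk sketch that counts processes per controlling terminal. The helper names `tty_usage` and `distinct_ttys` are made up for illustration, and the sketch assumes the tty name is in the first column (as with `ps -o tty`) and that `??` or `-` mean "no controlling terminal".

```shell
# Count processes per controlling terminal.
# Reads `ps -o tty` style output on stdin: one header line, then one tty per line.
# tty_usage and distinct_ttys are hypothetical helper names, not FreeBSD tools.
tty_usage() {
    # Skip the header, blank lines, and placeholders assumed to mean "no tty".
    awk 'NR > 1 && $1 != "" && $1 != "??" && $1 != "-" { count[$1]++ }
         END { for (t in count) printf "%s %d\n", t, count[t] }' | LC_ALL=C sort
}

# Print how many distinct terminals appear in the same kind of input.
distinct_ttys() {
    tty_usage | wc -l | tr -d ' '
}
```

Something like `ps -ax -o tty | tty_usage` should then give one line per terminal with its process count, which is easier to compare against the nine-to-ten session ceiling than the raw listing.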

pty/tty entries (I also looked at the last modification times for these, and a lot of them go as far back as Dec 2011):

Code:
# ls -al /dev/pty* | wc -l
     512

# ls -al /dev/tty* | wc -l
     534

(I can send the /etc/ttys contents if anyone wants to check them, but the file looks OK to me. It's the default and it matches other working systems.)

Code:
# grep -v '#' /etc/ttys | grep tty | wc -l
     526

Code:
# df | grep devfs
devfs                 1         1        0   100%    /dev

Code:
# w | grep root | wc -l
8

Code:
# ps -auxx | grep getty
root       1322  0.0  0.0  4668     0  v0  IWs+ -         0:00.00 /usr/libexec/getty Pc ttyv0
root       1323  0.0  0.0  4668     0  v1  IWs+ -         0:00.00 /usr/libexec/getty Pc ttyv1
root       1324  0.0  0.0  4668     0  v2  IWs+ -         0:00.00 /usr/libexec/getty Pc ttyv2
root       1325  0.0  0.0  4668     0  v3  IWs+ -         0:00.00 /usr/libexec/getty Pc ttyv3
root       1326  0.0  0.0  4668     0  v4  IWs+ -         0:00.00 /usr/libexec/getty Pc ttyv4
root       1327  0.0  0.0  4668     0  v5  IWs+ -         0:00.00 /usr/libexec/getty Pc ttyv5
root       1328  0.0  0.0  4668     0  v6  IWs+ -         0:00.00 /usr/libexec/getty Pc ttyv6
root       1329  0.0  0.0  4668     0  v7  IWs+ -         0:00.00 /usr/libexec/getty Pc ttyv7

When running expect:

Code:
# pkg_info | grep expect (devel version)
expect-nox11-5.44.1.7 A sophisticated scripter based on tcl/tk

# truss -o truss.log -f expect -c "spawn ls -al"
spawn ls -al
The system has no more ptys.  Ask your system administrator to create more.
    while executing
"spawn ls -al"

# cat truss.log
[...]
68541: open("/etc/group",O_RDONLY,0666)          = 5 (0x5)
68541: fstat(5,{mode=-rw-r--r-- ,inode=17145897,size=557,blksize=4096}) = 0 (0x0)
68541: lseek(5,0x0,SEEK_CUR)                     = 0 (0x0)
68541: lseek(5,0x0,SEEK_SET)                     = 0 (0x0)
68541: read(5,"# $FreeBSD: src/etc/group,v 1.35"...,4096) = 557 (0x22d)
68541: close(5)                                  = 0 (0x0)
68541: open("/dev/ptyp0",O_RDWR,00)              ERR#5 'Input/output error'
68541: open("/dev/ptyp1",O_RDWR,00)              ERR#5 'Input/output error'
68541: open("/dev/ptyp2",O_RDWR,00)              ERR#5 'Input/output error'
[...]
68541: open("/dev/ptyOv",O_RDWR,00)              ERR#5 'Input/output error'
68541: close(-1)                                 ERR#9 'Bad file descriptor'
68541: close(-1)                                 ERR#9 'Bad file descriptor'
68541: open("/",O_RDONLY,00)                     = 5 (0x5)
68541: close(5)                                  = 0 (0x0)
68541: write(2,"The system has no more ptys.  As"...,110) = 110 (0x6e)
68541: write(2,"\r\n",2)                         = 2 (0x2)
68541: open("/usr/local/lib/expect5.44.1.7/expect.rc",O_RDONLY,00) ERR#2 'No such file or directory'
68541: open("/root/.expect.rc",O_RDONLY,00)      ERR#2 'No such file or directory'
68541: ioctl(4,TIOCGETA,0xffffe640)              = 0 (0x0)
68541: fcntl(4,F_GETFL,)                         = 6 (0x6)
68541: fcntl(4,F_SETFL,0x2)                      = 0 (0x0)
68541: fcntl(4,F_GETFL,)                         = 2 (0x2)
68541: close(4)                                  = 0 (0x0)
68541: open("/dev/null",O_RDONLY,00)             = 4 (0x4)
68541: fcntl(4,F_SETFD,FD_CLOEXEC)               = 0 (0x0)
68541: close(2)                                  = 0 (0x0)
68541: open("/dev/null",O_RDONLY,00)             = 2 (0x2)
68541: fcntl(2,F_SETFD,FD_CLOEXEC)               = 0 (0x0)
68541: close(0)                                  = 0 (0x0)
68541: close(1)                                  = 0 (0x0)
68541: open("/dev/null",O_RDONLY,00)             = 0 (0x0)
68541: fcntl(0,F_SETFD,FD_CLOEXEC)               = 0 (0x0)
68541: fcntl(4,F_GETFL,)                         = 0 (0x0)
68541: fcntl(4,F_SETFL,0x0)                      ERR#25 'Inappropriate ioctl for device'
68541: close(4)                                  = 0 (0x0)
68541: close(2)                                  = 0 (0x0)
68541: fcntl(2,F_GETFL,0x800a4c4ec)              ERR#9 'Bad file descriptor'
68541: fcntl(2,F_SETFL,0xfffffffb)               ERR#9 'Bad file descriptor'
68541: fcntl(1,F_GETFL,0x800a4c4ec)              ERR#9 'Bad file descriptor'
68541: fcntl(1,F_SETFL,0xfffffffb)               ERR#9 'Bad file descriptor'
68541: fcntl(0,F_GETFL,)                         = 0 (0x0)
68541: fcntl(0,F_SETFL,0x0)                      ERR#25 'Inappropriate ioctl for device'
68541: close(0)                                  = 0 (0x0)
68541: process exit, rval = 0
[...]
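
To check whether every master device fails the same way, the truss log can be tallied rather than read line by line. This is only a sketch: `pty_open_errors` is a hypothetical helper name, and it assumes the log uses the `open("/dev/pty...") ... ERR#N 'text'` layout shown above.

```shell
# Tally failed open() calls on /dev/pty* in a truss log, grouped by error.
# Reads the log on stdin and prints "<count> ERR#<n> <message>" per distinct error.
# pty_open_errors is a hypothetical helper name.
pty_open_errors() {
    awk '/open\("\/dev\/pty[^"]*"/ && /ERR#/ {
             sub(/^.*ERR#/, "ERR#")   # keep only the error number and message
             count[$0]++
         }
         END { for (e in count) printf "%d %s\n", count[e], e }' | sort -rn
}
```

Running `pty_open_errors < truss.log` should print a single line if every /dev/pty* open fails with the same error, and several lines if some devices fail differently from the rest.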

SSH client:

Code:
# ssh -vv root@host
Password:
 [...]
debug1: Authentication succeeded (publickey).
debug1: channel 0: new [client-session]
debug2: channel 0: send open
debug1: Entering interactive session.
debug2: callback start
debug2: client_session2_setup: id 0
debug2: channel 0: request pty-req confirm 0
debug2: channel 0: request shell confirm 0
debug2: fd 3 setting TCP_NODELAY
debug2: callback done
debug2: channel 0: open confirm rwindow 0 rmax 32768
debug2: channel 0: rcvd adjust 131072
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.

(here it just hangs)

Server sshd debug log:

Code:
[...]
Jun 28 17:42:20 neteng sshd[68948]: debug1: Allocating pty.
Jun 28 17:42:20 neteng sshd[68948]: error: openpty: No such file or directory
Jun 28 17:42:20 neteng sshd[68948]: error: session_pty_req: session 0 alloc failed
[...]

I noticed that after the rancid script (rancid-run, from cron) has run, the ttys are still assigned to user rancid.

Code:
# ls -al /dev/ | grep rancid
crw--w----   1 rancid  tty         4,  94 Jun 29 12:16 ttyOq
crw--w----   1 rancid  tty         4,  96 Jun 29 12:16 ttyOr
crw--w----   1 rancid  tty         4,  98 Jun 29 12:16 ttyOs
crw--w----   1 rancid  tty         4, 100 Jun 29 12:16 ttyOt
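
One way to tell whether such nodes are really left over is to cross-check them against the process list. The sketch below is untested on 7.0 and the helper names (`non_root_ttys`, `unattached_ttys`) are made up. It assumes `ls -al /dev` and `ps -ax -o tty` output have been saved to files, and that ps prints full tty names such as `ttyOq`.

```shell
# Print "<tty> <owner>" for every tty character device not owned by root.
# Reads `ls -al /dev` output on stdin. Helper names here are hypothetical.
non_root_ttys() {
    awk '$1 ~ /^c/ && $NF ~ /^tty/ && $3 != "root" { print $NF, $3 }'
}

# Of those, print the ones that no process lists as its controlling terminal.
# $1: file holding `ls -al /dev` output
# $2: file holding `ps -ax -o tty` output (full tty names, one per line)
unattached_ttys() {
    non_root_ttys < "$1" | while read -r tty owner; do
        # Compare against the first column only, in case ps pads the line.
        awk '{ print $1 }' "$2" | grep -qxF -- "$tty" || echo "$tty $owner"
    done
}
```

For example, after `ls -al /dev > ls.txt` and `ps -ax -o tty > ps.txt`, running `unattached_ttys ls.txt ps.txt` lists nodes still owned by a non-root user with no process attached, which would be consistent with ptys not being released after use.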

rancid config, set to try only four devices at a time:

Code:
[... rancid.conf ...]
# The number of devices to collect simultaneously.
PAR_COUNT=4; export PAR_COUNT

tty sysctls:

Code:
# sysctl -a | grep tty
kern.tty_nout: 701863739925
kern.tty_nin: 4478698124
kern.constty_wakeups_per_second: 5
kern.console: consolectl,/consolectl,ttyd0,
debug.ttydebug: 0
#
 
Anyone have any ideas? I'm guessing that for some reason ttys/ptys are not being freed properly after use. How would I go about freeing the unused ptys/ttys and making them available again? I understand FreeBSD 8.x and up uses Unix98-style ptys (/dev/ptmx). Unfortunately, upgrading is not an option at this time.
 
Yeah, I am aware, and have come to the same conclusion. Doing further research, it seems there are some bugs in the pty/tty code in 7.x. I've done some testing and applied various patches I've come across on the FreeBSD lists, etc. The only workaround, it seems, is to nuke all pty* and tty* entries in /dev (probably not necessary, but it makes me feel better) and reboot the system, which remounts devfs on /dev and recreates the /dev/*ty* entries. Unless there is a way to force devfs to reload/remount without causing problems?

Thanks for the reply.
 