MPD process crashes randomly ==> critical

Hello all,

I have a serious problem on my FreeBSD 7.2 server with one process (MPD 5.3).

My server is a Dell R300 with 2 GB of RAM.

Code:
PE R300 CORE 2 DUO E6405 (2.13GHZ, 2MB,
2GB FOR 1 CPU (2X1GB SINGLE-RANK DIMMS)
80GB SATA 7200 RPM 3.5"

Everything seems to be clean, but the mpd5 process crashes randomly.

It can crash one, two, or even five times per day.

I have noticed something:

when I run the top -S command,

I can see the process grow to approximately 92180K (SIZE) and 87904K (RES).
After that it crashes without any error in the log files.

Code:
last pid: 76409;  load averages:  0.24,  0.19,  0.16 up 144+05:18:25 16:30:39
72 processes:  3 running, 50 sleeping, 19 waiting
CPU:  0.2% user,  0.0% nice, 19.1% system, 11.4% interrupt, 69.2% idle
Mem: 93M Active, 334M Inact, 147M Wired, 292K Cache, 112M Buf, 1419M Free
Swap: 4052M Total, 4052M Free

PID   USERNAME   THR  PRI   NICE   SIZE    RES     STATE  C   TIME   WCPU  COMMAND
73434 root        1   44    0      92180K  87904K  select 1   0:14  0.00%  mpd5

I have never seen the mpd5 process consume more than 94000K; it has always crashed before reaching that.
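
To follow this growth over time instead of watching top by hand, something like the following could record the size of mpd5 at regular intervals. This is only a rough sketch: it assumes a recent Python 3 is installed from ports, that ps reports vsz and rss in kilobytes, and the log file name mpd5-mem.log and the 60-second interval are made up for the example.

```python
import subprocess
import time


def parse_ps_line(line):
    """Parse one 'pid vsz rss comm' line (sizes in kB) into a dict."""
    pid, vsz_kb, rss_kb, comm = line.split(None, 3)
    return {"pid": int(pid), "vsz_kb": int(vsz_kb),
            "rss_kb": int(rss_kb), "comm": comm.strip()}


def sample(name="mpd5"):
    """Return one record per running process whose command name is `name`."""
    # One -o per keyword with an empty header ("pid=") suppresses the header line.
    out = subprocess.run(
        ["ps", "-ax", "-o", "pid=", "-o", "vsz=", "-o", "rss=", "-o", "comm="],
        capture_output=True, text=True, check=True,
    ).stdout
    records = [parse_ps_line(line) for line in out.splitlines() if line.strip()]
    return [r for r in records if r["comm"] == name]


def log_forever(logfile="mpd5-mem.log", interval=60):
    """Append one timestamped line per mpd5 process every `interval` seconds."""
    # logfile and interval are hypothetical defaults chosen for this example.
    while True:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        with open(logfile, "a") as fh:
            for r in sample():
                fh.write(f"{stamp} pid={r['pid']} "
                         f"size={r['vsz_kb']}K res={r['rss_kb']}K\n")
        time.sleep(interval)
```

Calling log_forever() in the background would leave a file whose last lines show how large the process was shortly before each crash.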

The limits command shows me this:

Code:
Resource limits (current):
  cputime          infinity secs
  filesize         infinity kB
  datasize           524288 kB
  stacksize           65536 kB
  coredumpsize     infinity kB
  memoryuse        infinity kB
  memorylocked     infinity kB
  maxprocesses         7390
  openfiles           65536
  sbsize           infinity bytes
  vmemoryuse       infinity kB

Is there anything wrong?
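
To put the numbers side by side, here is a small sketch that compares the SIZE of mpd5 from the top output with the datasize limit. It only uses a few of the lines shown above. SIZE is the total virtual size of the process, so it can only be treated as an upper bound on the data segment.

```python
def parse_limits(text):
    """Map resource name -> limit as printed by `limits` (None for infinity)."""
    limits = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines and the "Resource limits (current):" header.
        if len(parts) < 2 or parts[0] == "Resource":
            continue
        name, value = parts[0], parts[1]
        limits[name] = None if value == "infinity" else int(value)
    return limits


LIMITS_OUTPUT = """\
Resource limits (current):
  datasize           524288 kB
  stacksize           65536 kB
  memoryuse        infinity kB
  vmemoryuse       infinity kB
"""

limits = parse_limits(LIMITS_OUTPUT)
size_kb = 92180  # SIZE of mpd5 taken from the top output
ratio = size_kb / limits["datasize"]
print(f"mpd5 SIZE is {ratio:.1%} of the datasize limit")  # prints 17.6%
```

So at the moment of the crash the process is at less than a fifth of the datasize limit, and the memoryuse and vmemoryuse limits are both unlimited.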

When I run

dmesg

I get a lot of messages like this:

Code:
Limiting icmp unreach response from 1158 to 1000 packets/sec
Limiting icmp unreach response from 1116 to 1000 packets/sec
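
To see by how much the ICMP rate limit is exceeded, the messages can be pulled apart with a small sketch like this one (it assumes the message format is exactly the one shown above):

```python
import re

ICMP_LIMIT = re.compile(
    r"Limiting icmp unreach response from (\d+) to (\d+) packets/sec"
)


def icmp_overruns(dmesg_text):
    """Return (attempted, allowed) pairs, one per rate-limit message."""
    return [(int(a), int(b)) for a, b in ICMP_LIMIT.findall(dmesg_text)]


DMESG_SAMPLE = """\
Limiting icmp unreach response from 1158 to 1000 packets/sec
Limiting icmp unreach response from 1116 to 1000 packets/sec
"""

for attempted, allowed in icmp_overruns(DMESG_SAMPLE):
    print(f"{attempted} replies/sec wanted, {attempted - allowed} over the limit of {allowed}")
```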

When I run netstat -w 1 -I bge0:

Code:
            input         (bge0)           output
   packets  errs      bytes    packets  errs      bytes colls
     45915     0   30634546      47675     0   30669286     0
     42241     0   27539064      43755     0   27584617     0
     41016     0   27682952      42495     0   27721968     0
     39476     0   25983967      40738     0   26016017     0
     39966     0   25976848      41113     0   25974928     0
     42605     0   28045725      44197     0   28093727     0
     43259     0   28868911      45237     0   28878535     0
     42691     0   28352497      44597     0   28401179     0


There are no packets in error.
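
To give an idea of the load, here is a quick calculation on the eight one-second samples above. It is only a sketch: the rows are copied by hand from the netstat output and the error and collision columns (all zero) are left out.

```python
# (in packets, in bytes, out packets, out bytes), one row per second
NETSTAT_ROWS = [
    (45915, 30634546, 47675, 30669286),
    (42241, 27539064, 43755, 27584617),
    (41016, 27682952, 42495, 27721968),
    (39476, 25983967, 40738, 26016017),
    (39966, 25976848, 41113, 25974928),
    (42605, 28045725, 44197, 28093727),
    (43259, 28868911, 45237, 28878535),
    (42691, 28352497, 44597, 28401179),
]


def rate_summary(rows):
    """Average packets/sec, Mbit/s and bytes per packet for each direction."""
    n = len(rows)
    in_pkts = sum(r[0] for r in rows)
    in_bytes = sum(r[1] for r in rows)
    out_pkts = sum(r[2] for r in rows)
    out_bytes = sum(r[3] for r in rows)
    return {
        "in_pps": in_pkts / n,
        "in_mbit": in_bytes * 8 / n / 1e6,
        "in_avg_pkt": in_bytes / in_pkts,
        "out_pps": out_pkts / n,
        "out_mbit": out_bytes * 8 / n / 1e6,
        "out_avg_pkt": out_bytes / out_pkts,
    }


summary = rate_summary(NETSTAT_ROWS)
for key, value in summary.items():
    print(f"{key}: {value:.1f}")
```

On these samples the inbound side works out to about 42,000 packets per second and roughly 223 Mbit/s, with the outbound side very close to that.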


Last thing, the output of the sysctl -a command:

Code:
kern.ostype: FreeBSD
kern.osrelease: 7.2-RELEASE
kern.osrevision: 199506
kern.version: FreeBSD 7.2-RELEASE #0: Fri May  1 08:49:13 UTC 2009
    root@walker.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC

kern.maxvnodes: 100000
kern.maxproc: 8212
kern.maxfiles: 2097152
kern.argmax: 262144
kern.securelevel: -1
kern.hostname: NOMO-TH2-LTS1.nomotech.net
kern.hostid: 2364878361
kern.clockrate: { hz = 1000, tick = 1000, profhz = 2000, stathz = 133 }
kern.posix1version: 200112
kern.ngroups: 16
kern.job_control: 1
kern.saved_ids: 0
kern.boottime: { sec = 1254478364, usec = 475974 } Fri Oct  2 12:12:44 2009
kern.domainname:
kern.osreldate: 702000
kern.bootfile: /boot/kernel/kernel
kern.maxfilesperproc: 65536
kern.maxprocperuid: 7390
kern.ipc.maxsockbuf: 262144
kern.ipc.sockbuf_waste_factor: 8
kern.ipc.somaxconn: 4096
kern.ipc.max_linkhdr: 16
kern.ipc.max_protohdr: 60
kern.ipc.max_hdr: 76
kern.ipc.max_datalen: 128
kern.ipc.nmbjumbo16: 4224
kern.ipc.nmbjumbo9: 8448
kern.ipc.nmbjumbop: 16896
kern.ipc.nmbclusters: 65536
kern.ipc.piperesizeallowed: 1
kern.ipc.piperesizefail: 0
kern.ipc.pipeallocfail: 0
kern.ipc.pipefragretry: 0
kern.ipc.pipekva: 81920
kern.ipc.maxpipekva: 65000000
kern.ipc.msgseg: 2048
kern.ipc.msgssz: 8
kern.ipc.msgtql: 40
kern.ipc.msgmnb: 2048
kern.ipc.msgmni: 40
kern.ipc.msgmax: 16384
kern.ipc.semaem: 16384
kern.ipc.semvmx: 32767
kern.ipc.semusz: 136
kern.ipc.semume: 10
kern.ipc.semopm: 100
kern.ipc.semmsl: 60
kern.ipc.semmnu: 30
kern.ipc.semmns: 60
kern.ipc.semmni: 10
kern.ipc.semmap: 30
kern.ipc.shm_allow_removed: 0
kern.ipc.shm_use_phys: 0
kern.ipc.shmall: 8192
kern.ipc.shmseg: 128
kern.ipc.shmmni: 192
kern.ipc.shmmin: 1
kern.ipc.shmmax: 33554432
kern.ipc.maxsockets: 65536
kern.ipc.numopensockets: 168
kern.ipc.nsfbufsused: 0
kern.ipc.nsfbufspeak: 8
kern.ipc.nsfbufs: 8704
kern.dummy: 0
kern.ps_strings: 3217031152
kern.usrstack: 3217031168
kern.logsigexit: 1
kern.iov_max: 1024
kern.hostuuid: 44454c4c-4300-1057-8059-cac04f33344a
kern.cam.cam_srch_hi: 0
kern.cam.scsi_delay: 5000
kern.cam.cd.retry_count: 4
kern.cam.cd.changer.max_busy_seconds: 15
kern.cam.cd.changer.min_busy_seconds: 5
kern.cam.da.da_send_ordered: 1
kern.cam.da.default_timeout: 60
kern.cam.da.retry_count: 4
kern.cam.da.0.minimum_cmd_size: 6
kern.dcons.poll_hz: 100
kern.disks: da0
kern.geom.collectstats: 1
kern.geom.debugflags: 0
kern.geom.label.debug: 0
kern.elf32.fallback_brand: -1
kern.init_shutdown_timeout: 120
kern.init_path: /sbin/init:/sbin/oinit:/sbin/init.bak:/rescue/init:/stand/sysinstall
kern.acct_suspended: 0
kern.acct_configured: 0
kern.acct_chkfreq: 15
kern.acct_resume: 4
kern.acct_suspend: 2
kern.cp_time: 1617943 2955 385726702 197561134 2649676317
kern.openfiles: 174
kern.kq_calloutmax: 4096
kern.ps_arg_cache_limit: 256
kern.stackprot: 7
kern.randompid: 0
kern.lastpid: 76445
kern.ktrace.request_pool: 100
kern.ktrace.genio_size: 4096
kern.module_path: /boot/kernel;/boot/modules
kern.malloc_count: 260
kern.fallback_elf_brand: -1
kern.features.compat_freebsd6: 1
kern.features.compat_freebsd5: 1
kern.features.compat_freebsd4: 1
kern.maxusers: 512
kern.ident: GENERIC
kern.kstack_pages: 2
kern.shutdown.kproc_shutdown_wait: 60
kern.shutdown.poweroff_delay: 5000
kern.sync_on_panic: 0
kern.corefile: %N.core
kern.nodump_coredump: 0
kern.coredump: 1
kern.sugid_coredump: 0
kern.sigqueue.alloc_fail: 0
kern.sigqueue.overflow: 0
kern.sigqueue.preallocate: 1024
kern.sigqueue.max_pending_per_proc: 128
kern.forcesigexit: 1
kern.fscale: 2048

===============================================
Processes:              (RUNQ: 1 Disk Wait: 0 Page Wait: 0 Sleep: 19)
Virtual Memory:         (Total: 2299004K, Active 155540K)
Real Memory:            (Total: 194000K Active 101960K)
Shared Virtual Memory:  (Total: 7564K Active: 6140K)
Shared Real Memory:     (Total: 5020K Active: 4084K)
Free Memory Pages:      1453196K

I don't understand what is happening, because only the mpd5 process crashes;
all the other processes continue to run normally.

I have set up a script that restarts the mpd process, but that is not a clean solution
and I want to understand what is happening.
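
For reference, a restart script of this kind can be sketched as below. This is not my exact script, just a minimal illustration: it assumes pgrep is available, and the restart command passed in (for example the rc script installed by the mpd5 port, whose path is an assumption here) has to be adapted to the local setup.

```python
import subprocess
import time


def is_running(name="mpd5", run=subprocess.run):
    """True if pgrep finds a process whose name is exactly `name`."""
    return run(["pgrep", "-x", name], stdout=subprocess.DEVNULL).returncode == 0


def watch(restart_cmd, name="mpd5", interval=10, run=subprocess.run,
          sleep=time.sleep, max_checks=None, log=print):
    """Check every `interval` seconds and restart the daemon when it is gone.

    Returns the number of restarts (only reachable when max_checks is set).
    """
    checks = 0
    restarts = 0
    while max_checks is None or checks < max_checks:
        if not is_running(name, run):
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            log(f"{stamp} {name} is not running, restarting")
            run(restart_cmd)
            restarts += 1
        sleep(interval)
        checks += 1
    return restarts
```

It would be started with something like watch(["/usr/local/etc/rc.d/mpd5", "restart"]). The timestamps it prints at least give the exact time of each crash, which can then be compared with the other logs.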

If someone could have a look at this, it would be very helpful.
 
All the logs are clean; nothing in them shows anything wrong.

It is as if the process exits cleanly.

The only thing I have noticed is that, as far as I can tell,
the system cannot allocate more than ~91,000K of memory for the mpd5 process.

When it crashes, it is always between 89108K and 93000K.

Sometimes it takes one hour to grow to this value and sometimes
it takes one day.

It is very hard to diagnose.
 
We deliberately chose not to post the mpd log,
because everything in it was clean,

even with log + all enabled.
 