Dell 1318 ACPI problems

gnemmi

Active Member

Reaction score: 21
Messages: 220

Well .. I've been playing around with 7.2-RELEASE for the last few days on my DELL 1318 and I've got all sorts of problems in here ... ranging from:

Code:
gargoyle# cp /boot/defaults/loader.conf /boot/loader.conf
gargoyle# reboot
resulting in:

Code:
Loading: /boot/defaults/loader.conf
Loading: /boot/defaults/loader.conf
Loading: /boot/defaults/loader.conf
Loading: /boot/defaults/loader.conf
Loading: /boot/defaults/loader.conf
Error: stack overflow
|
can't load 'kernel'

Type '?' for a list of commands, 'help' for more detailed help.
OK _
which I end up resolving by issuing a simple:

Code:
OK boot-conf
To ACPI issues ...

Here's the info I've gathered so far regarding ACPI ... if somebody finds it useful and can give me a hand, I'll be more than grateful ...

sysctl hw.acpi results (battery is not attached to the system): sysctl -a | grep hw.acpi | sort

acpiconf -s 3 gets the machine into suspend state, but it won't resume .. it throws something like this:

Code:
bge0: PHY write timeout (phy1, reg 0, val 32768)
bge0: PHY write timeout (phy1, reg 24, val 0xffffff)
bge0: PHY write timeout (phy1, reg 16, val0xffffff)
bge0: PHY write timeout (phy1, reg 16, val0xffffff)
bge0: PHY write timeout (phy1, reg 24, val 0xffffff)
bge0: flow trough queue init failed 
bge0: initialization failure
then something similar with fwohci(4) .. then ad4 problems and then it all ends up with a hard power off :(

The problem is that I don't know how to catch those messages so I can't post them in here .. I've looked everywhere in the system but I couldn't find them so far .. it seems those are not logged anywhere.

acpiconf -s 4 and acpiconf -s 4BIOS have a similar result. The machine hangs with this message:

Code:
gargoyle# acpiconf -s 4BIOS
fwohci0: fwohci_pci_suspend
Hard power off is mandatory after that ...

I don't have a place to upload my ASL yet, but I'll find one soon .. meanwhile here's the result output from: gargoyle# iasl gonzalo-Dell1318.asl

Here's my boot -v (acpi enabled)

Regrettably ..

Code:
OK unset acpi_conf
OK boot -v
and

Code:
OK set acpi_conf=NO
OK boot -v
both result in a "fatal trap 9" core dump, so I don't know how to get a "boot -v" with ACPI disabled :(

some more info just in case:

kldstat
Code:
> kldstat
Id Refs Address    Size     Name
 1    7 0xc0400000 9fab28   kernel
 2    1 0xc0dfb000 6a45c    acpi.ko
 3    1 0xc4658000 22000    linux.ko
vmstat -i
Code:
> vmstat -i
interrupt                          total       rate
irq1: atkbd0                         416          1
irq9: acpi0                            1          0
irq14: ata0                           58          0
irq17: atapci1                      2717          8
irq19: fwohci0                         3          0
cpu0: timer                       644441       1995
irq256: bge0                         358          1
Total                             647994       2006
>
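The information gathered above (sysctl hw.acpi, kldstat, vmstat -i) can be collected into a single file before each suspend attempt, so that it survives a hard power-off. This is only a minimal sketch: the output path and the `collect` helper are names made up for this example, and it does not capture the messages printed during a failed resume.

```shell
#!/bin/sh
# Collect ACPI-related diagnostics into one file before attempting a suspend.
# OUT is a hypothetical path chosen for this sketch.
OUT="${OUT:-./acpi-report.txt}"

collect() {
    # Run a command if it exists on this system; note its absence otherwise.
    echo "=== $* ==="
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1
    else
        echo "(command not available: $1)"
    fi
    echo
}

{
    collect uname -a
    collect sysctl hw.acpi
    collect kldstat
    collect vmstat -i
    collect dmesg
} > "$OUT"

sync   # flush buffers so the report is on disk before the machine is suspended
echo "report written to $OUT"
```

Running it right before `acpiconf -s 3` leaves a snapshot of the pre-suspend state that can be attached to a mailing-list post.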
I installed Mandriva 2009.1 Spring just to check whether it had suspend/hibernate support for this particular model, and it does .. and pretty well actually .. so .. I _really_ want to get this fixed ...

Please, if you or someone you know might be interested, let me know and I'll do my best to help solve these issues.

Thanks in advance :)
 

richardpl

Aspiring Daemon

Reaction score: 68
Messages: 841

gnemmi said:
Well .. I've been playing around with 7.2-RELEASE for the last few days on my DELL 1318 and I've got all sorts of problems in here ... ranging from:

Code:
gargoyle# cp /boot/defaults/loader.conf /boot/loader.conf
gargoyle# reboot
You should never do that; undo it.
resulting in:

Code:
Loading: /boot/defaults/loader.conf
Loading: /boot/defaults/loader.conf
Loading: /boot/defaults/loader.conf
Loading: /boot/defaults/loader.conf
Loading: /boot/defaults/loader.conf
Error: stack overflow
|
can't load 'kernel'

Type '?' for a list of commands, 'help' for more detailed help.
OK _
Your change caused a loop that never ends.

acpiconf -s 3 gets the machine into suspend state, but it won't resume .. it throws something like this:
On 7.2-RELEASE, resume doesn't work with SMP.
 
gnemmi

Active Member

Reaction score: 21
Messages: 220

Thanks for the heads up richardpl!

Actually, I always did it that way so I could make all my changes in /boot/loader.conf while keeping a pristine copy of /boot/defaults/loader.conf (which would get overwritten on an update, making me lose the changes I had introduced to my loader config). To be honest, I always thought that was the idea behind the whole "defaults" scheme, and it always worked OK on all of my machines. This is the first time (and install) in which I get this stack overflow.

Actually .. I think I recall reading that for the last time on "Absolute FreeBSD 2nd Edition" by M. Lucas, page 68 :\

I thought the line:

Code:
loader_conf_files="/boot/device.hints /boot/loader.conf /boot/loader.local.conf"
was supposed to prevent the loop in case there was a /boot/loader.conf or /boot/loader.local.conf file present on the system and override the values on /boot/defaults/loader.conf with those in /boot/loader.conf.

What would be the best way to proceed then?

Will recompile the kernel, get rid of SMP (I have a Celeron 560 in here), bge(4) and fwohci(4) and load them through loader.conf to see what happens and if I still get the same behaviour, maybe I should list them on /etc/rc.suspend and /etc/rc.resume and see what I get ...
 

richardpl

Aspiring Daemon

Reaction score: 68
Messages: 841

gnemmi said:
Thanks for the heads up richardpl!

Actually, I always did it that way so I could make all my changes in /boot/loader.conf while keeping a pristine copy of /boot/defaults/loader.conf (which would get overwritten on an update, making me lose the changes I had introduced to my loader config). To be honest, I always thought that was the idea behind the whole "defaults" scheme, and it always worked OK on all of my machines. This is the first time (and install) in which I get this stack overflow.

Actually .. I think I recall reading that for the last time on "Absolute FreeBSD 2nd Edition" by M. Lucas, page 68 :\

I thought the line:

Code:
loader_conf_files="/boot/device.hints /boot/loader.conf /boot/loader.local.conf"
was supposed to prevent the loop in case there was a /boot/loader.conf or /boot/loader.local.conf file present on the system and override the values on /boot/defaults/loader.conf with those in /boot/loader.conf.
If you had ever read the file you copied so blindly, you would have edited one line at the beginning ...

What would be the best way to proceed then?

Will recompile the kernel, get rid of SMP (I have a Celeron 560 in here), bge(4) and fwohci(4) and load them through loader.conf to see what happens and if I still get the same behaviour, maybe I should list them on /etc/rc.suspend and /etc/rc.resume and see what I get ...
You don't need to recompile the kernel to get rid of SMP; on your CPU, resume should work with other kinds of hacks. In your case you will need to find which driver(s) cause the suspend/resume problems.
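One way to start hunting for the offending driver is to list what is loaded as a module, since only those can be unloaded one at a time before a suspend test. The sketch below is an assumption-laden illustration: `list_modules` is a made-up helper name, and the sample input is the kldstat output from the first post. Drivers built into the kernel, as bge and fwohci appear to be here (they do not show up in kldstat), cannot be removed this way. As far as I know, SMP can be switched off without a rebuild through the loader tunable `kern.smp.disabled=1`.

```shell
#!/bin/sh
# List loadable modules (everything except the kernel itself) from kldstat output.
list_modules() {
    # Skip the header line, then print the Name column unless it is "kernel".
    awk 'NR > 1 && $5 != "kernel" { print $5 }'
}

# Sample input copied from the kldstat output in the first post.
sample='Id Refs Address    Size     Name
 1    7 0xc0400000 9fab28   kernel
 2    1 0xc0dfb000 6a45c    acpi.ko
 3    1 0xc4658000 22000    linux.ko'

printf '%s\n' "$sample" | list_modules
```

On a live system the same function would be fed with `kldstat | list_modules`.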
 
gnemmi

Active Member

Reaction score: 21
Messages: 220

Oh no .. sure .. it wasn't a "blind" copy ... what I meant was that I did copy /boot/defaults/loader.conf to /boot/loader.conf in order to do all my edits in there .. and that I did so .. yet still, I get that stack overflow. Sorry for the misunderstanding; I should've explained myself better!

Regarding ACPI .. it seems it's a full _no_go_

I did recompile my kernel, removing SMP, bge and firewire (fwohci). I loaded bge through /boot/defaults/loader.conf (which I ended up using because using /boot/loader.conf results in a stack overflow, as explained above), got rid of fwohci completely, and added kldunload if_bge to /etc/rc.suspend (which does unload miibus and bge as expected as soon as I issue # acpiconf -s 3) and kldload if_bge to /etc/rc.resume. But the latter never gets to do its bidding because ad4 won't come back from suspend ... so ... there seems to be a problem with the ata driver.
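For reference, the unload/reload hooks described above could be sketched roughly as follows. The function names and the `RUN` dry-run switch are inventions for this example, not part of the stock /etc/rc.suspend or /etc/rc.resume; by default the commands are only printed, and setting `RUN=` (empty) would execute them.

```shell
#!/bin/sh
# Dry-run by default: commands are echoed instead of executed.
RUN="${RUN-echo}"

# Modules to drop before suspend and bring back after resume.
MODULES="if_bge"

# Intended to be called from /etc/rc.suspend (hypothetical helper).
unload_for_suspend() {
    for m in $MODULES; do
        $RUN kldunload "$m"
    done
}

# Intended to be called from /etc/rc.resume (hypothetical helper).
reload_after_resume() {
    for m in $MODULES; do
        $RUN kldload "$m"
    done
}
```

Keeping the module list in one variable makes it easy to add further suspects while narrowing down which driver breaks resume.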

I still get the same results ... have to hard power off .. and booting "without" ACPI still results in a "Fatal trap 9" kernel dump prompting me to press any key to reboot :(

Any help is appreciated !
 
gnemmi

Active Member

Reaction score: 21
Messages: 220

I just removed FreeBSD 7.2 and installed PC-BSD 7.1 (7.2-PRE) just in case, but I have the same problems ... although PC-BSD handles them more gracefully ...

Will try acpi@ and see what happens ... AFAIC FreeBSD acpi seems to be really hosed ...

I really hope I can get this sorted out because I don't want to move away from FreeBSD but I'll have to if it can't handle ACPI :(

If you can help, please pitch in: I'm willing to do whatever it takes to solve this issue, run whatever test provides any info, and apply any patch to get this thing working.

As for now, my advice would be: If you are in the market looking for a notebook to run FreeBSD on, _DO_NOT_, and I repeat .. _DO_NOT_ buy a Dell 1318. x(

Regards.
 

richardpl

Aspiring Daemon

Reaction score: 68
Messages: 841

What hard disk? Maybe ata on CURRENT is in better shape for you.

Could you provide a link to the phantom /boot/loader.conf you copied so "blindly"? :)
 
gnemmi

Active Member

Reaction score: 21
Messages: 220

Regarding the disk, it's the one you can find in the boot -v output in my first post. Here you go:

Code:
atapci0: <Intel ICH8M UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x6fa0-0x6faf irq 16 at device 31.1 on pci0
atapci0: Reserved 0x10 bytes for rid 0x20 type 4 at 0x6fa0
ata0: <ATA channel 0> on atapci0
atapci0: Reserved 0x8 bytes for rid 0x10 type 4 at 0x1f0
atapci0: Reserved 0x1 bytes for rid 0x14 type 4 at 0x3f6
ata0: reset tp1 mask=03 ostat0=50 ostat1=00
ata0: stat0=0x00 err=0x01 lsb=0x14 msb=0xeb
ata0: stat1=0x00 err=0x00 lsb=0x00 msb=0x00
ata0: reset tp2 stat0=00 stat1=00 devices=0x4<ATAPI_MASTER>
ioapic0: routing intpin 14 (ISA IRQ 14) to vector 54
ata0: [MPSAFE]
ata0: [ITHREAD]
atapci1: <Intel (ID=28298086) AHCI controller> port 0x6eb0-0x6eb7,0x6eb8-0x6ebb,0x6ec0-0x6ec7,0x6ec8-0x6ecb,0x6ee0-0x6eff mem 0xf6dfb800-0xf6dfbfff irq 17 at device 31.2 on pci0
atapci1: Reserved 0x20 bytes for rid 0x20 type 4 at 0x6ee0
atapci1: Reserved 0x800 bytes for rid 0x24 type 3 at 0xf6dfb800
ioapic0: routing intpin 17 (PCI IRQ 17) to vector 55
atapci1: [MPSAFE]
atapci1: [ITHREAD]
atapci1: AHCI Version 01.10 controller with 3 ports detected
ata2: <ATA channel 0> on atapci1
ata2: SATA connect time=0ms
ata2: SIGNATURE: 00000101
ata2: ahci_reset devices=0x1<ATA_MASTER>
ata2: [MPSAFE]
ata2: [ITHREAD]
ata3: <ATA channel 1> on atapci1
ata3: port not implemented
ata3: [MPSAFE]
ata3: [ITHREAD]
ata4: <ATA channel 2> on atapci1
ata4: SATA connect status=00000004
ata4: ahci_reset devices=0x0
ata4: [MPSAFE]
ata4: [ITHREAD]
...
...
ata0-master: pio=PIO4 wdma=WDMA2 udma=UDMA33 cable=80 wire
acpi_acad0: acline initialization start
acd0: setting PIO4 on ICH8M chip
acpi_acad0: On Lineacd0: setting UDMA33 on ICH8M chip
 
acpi_acad0: acline initialization done, tried 1 times
battery0: battery initialization start
acd0: <MATSHITA DVD+/-RW UJ-875S/D200> DVDR drive at ata0 as master
acd0: read 4134KB/s (4134KB/s) write 4134KB/s (4134KB/s), 2048KB buffer, UDMA33
acd0: Reads: CDR, CDRW, CDDA stream, DVDROM, DVDR, DVDRAM, packet
acd0: Writes: CDR, CDRW, DVDR, DVDRAM, test write, burnproof
acd0: Audio: play, 256 volume levels
acd0: Mechanism: ejectable caddy, unlocked
acd0: Medium: no/blank disc
ata2-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
ad4: 238475MB <Seagate ST9250827AS 3.ADB> at ata2-master SATA300
ad4: 488397168 sectors [484521C/16H/63S] 16 sectors/interrupt 1 depth queue
GEOM: new disk ad4
ad4: Intel check1 failed
ad4: Adaptec check1 failed
ad4: LSI (v3) check1 failed
ad4: LSI (v2) check1 failed
ad4: FreeBSD check1 failed
GEOM_LABEL: Label for provider ad4s1 is msdosfs/DellUtility.
GEOM_LABEL: Label for provider ad4s2 is ntfs/OS.
GEOM_LABEL: Label for provider ad4s4a is ufsid/4a0448b836027261.
GEOM_LABEL: Label for provider ad4s4d is ufsid/4a0448bb6631b324.
GEOM_LABEL: Label for provider ad4s4e is ufsid/4a0448b80bec5d96.
GEOM_LABEL: Label for provider ad4s4f is ufsid/4a0448b836ae7b99.
(probe0:sbp0:0:0:0): error 22
(probe0:sbp0:0:0:0): Unretryable Error
(probe1:sbp0:0:1:0): error 22
(probe1:sbp0:0:1:0): Unretryable Error
(probe2:sbp0:0:2:0): error 22
(probe2:sbp0:0:2:0): Unretryable Error
(probe3:sbp0:0:3:0): error 22
(probe3:sbp0:0:3:0): Unretryable Error
(probe4:sbp0:0:4:0): error 22
(probe4:sbp0:0:4:0): Unretryable Error
(probe5:sbp0:0:5:0): error 22
(probe5:sbp0:0:5:0): Unretryable Error
(probe6:sbp0:0:6:0): error 22
(probe6:sbp0:0:6:0): Unretryable Error
ATA PseudoRAID loaded
...
the one that doesn't come back from resume is ad4 ...

Regarding my /boot/loader.conf, I'm sorry to say that I can no longer provide it to you because as I said in my previous post, I wiped FreeBSD 7.2 out in order to install PC-BSD 7.1. But what I can do is provide you with another two /boot/loader.conf files that do not give me the problem I have with 7.2

This is the /boot/loader.conf from my desktop (FreeBSD 7.0-RELEASE .. the weird values you'll see in there are because I'm using Oliver Fromme's graphical BootLoader) and this is the /boot/loader.conf from my internal server (FreeBSD 7.1-RELEASE). Both are extremely similar to the one I was using on my notebook, except that I have most modules compiled into my kernel on those machines, whereas I was using /boot/loader.conf to load the modules (if_bge, snd_hda, etc.) on my notebook. Neither of those two /boot/loader.conf files causes a stack overflow as seen on 7.2-RELEASE.

Hope you find them useful!

Thanks :)
 
gnemmi

Active Member

Reaction score: 21
Messages: 220

Just in case someone finds it useful or interesting, this is what you get when you boot without ACPI on a Dell 1318:

Code:
Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer		= 0x70:0xbfe4
stack pointer			= 0x28:0xfa4
frame pointer			= 0x28:0xfd4
code segment			= base 0xc00f0000, limit 0xffff, type 0x1b
				= DPL 0, pres 1, def32 0, gran 0
processor eflags		= interrupt enabled, resume, IOPL = 0
current process			= 0 (swapper)
trap number			= 9
panic: general protection fault
cpuid = 0
Uptime: 1s
Automatic reboot in 15 seconds - press a key on the console to abort
 
gnemmi

Active Member

Reaction score: 21
Messages: 220

No, there was no need to do so as this notebook came with the newest BIOS version already installed (A04).

Thanks for your interest, vermaden, and if you need more info or would like me to run some tests, please ask and I'll do it right away.

Best Regards
 

vermaden

Son of Beastie

Reaction score: 1,180
Messages: 2,764

@gnemmi

As richardpl said, suspend does not work on SMP systems on i386. I have heard that it works on SMP systems on amd64, so you should try that.

About the /boot/loader.conf issue, I would just clean it out for testing:
# :> /boot/loader.conf
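Once the file is empty, only the settings that differ from /boot/defaults/loader.conf need to go back in. The following is a minimal sketch under a few assumptions: it writes to an example path (`CONF`, a name chosen for this illustration) rather than to /boot, and the two module lines are simply the ones mentioned in this thread (if_bge, snd_hda).

```shell
#!/bin/sh
# Write a minimal override file instead of copying the whole defaults file.
# CONF points at an example path so nothing under /boot is touched.
CONF="${CONF:-./loader.conf.example}"

cat > "$CONF" <<'EOF'
# Local overrides only; everything else comes from /boot/defaults/loader.conf
if_bge_load="YES"
snd_hda_load="YES"
EOF

# Sanity check: an override file should not redefine loader_conf_files,
# the variable that tells the loader which files to read next.
if grep -q '^loader_conf_files=' "$CONF"; then
    echo "warning: $CONF redefines loader_conf_files" >&2
else
    echo "ok: $CONF contains overrides only"
fi
```

The same grep check can be pointed at an existing /boot/loader.conf to see whether it carries the `loader_conf_files` line from the defaults file.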
 
gnemmi

Active Member

Reaction score: 21
Messages: 220

SMP: I did recompile my kernel with SMP disabled, as I posted above, but to no avail .. results were exactly the same ... ad4 doesn't come back from suspend and I can't find the place where the error messages it throws get logged (there's nothing about it in /var/log/messages or under /var/crash). Booting without ACPI yields the same Fatal trap 9 :(

Regarding /boot/loader.conf, I did what I knew I had to from moment one .. that is .. I created a blank /boot/loader.conf and inserted only the lines that I needed to override from /boot/defaults/loader.conf.

It works as expected, but doesn't really solve the "problem" .. it just hides the "symptom" ... in the sense that in the past I could do a "blind" (or almost blind) copy of /boot/defaults/loader.conf without getting a stack overflow, but with 7.2 that changed .. and there's a stack overflow right there at anyone's fingertips. That was kind of my point ...

Best Regards
 

vermaden

Son of Beastie

Reaction score: 1,180
Messages: 2,764

Maybe try 8-CURRENT, ACPI/Suspend case is pretty fscked up on BSDs.

Other possibilities are: contact FreeBSD developers and/or submit bugs.
 
gnemmi

Active Member

Reaction score: 21
Messages: 220

Yes .. I was thinking about that .. I think I have gathered enough info to send some mails to acpi@ and kernel@ ...

Will see what happens there and post back in here as soon as I have something relevant that may help other users.

Thanks for your help vermaden !!

Regards

PS: I forgot to mention this, but the most incredible thing about this whole issue is that FreeBSD will suspend (acpiconf -s 3) flawlessly if you run it from the livefs cd (Live CD or Fixit) ... XD
 

vermaden

Son of Beastie

Reaction score: 1,180
Messages: 2,764

gnemmi said:
Thanks for your help vermaden !!
You are welcome mate, but I do not feel that I have helped here.

gnemmi said:
PS: I forgot to mention this, but the most incredible thing about this whole issue is that FreeBSD will suspend (acpiconf -s 3) flawlessly if you run it from the livefs cd (Live CD or Fixit) ... XD
Remember to mention this on the MLs; it may be a good point to start hunting bugs, but I am not a developer so ... good luck ;)
 
gnemmi

Active Member

Reaction score: 21
Messages: 220

vermaden said:
You are welcome mate, but I do not feel that I have helped here.
That fact was the last push I needed to send a mail to acpi@ ... If you couldn't help, who (amongst the usual _big_time_helpers_) could? :)

vermaden said:
Remember to mention this on the MLs; it may be a good point to start hunting bugs, but I am not a developer so ... good luck ;)
Will do and thanks once again!!

Best Regards
 

vermaden

Son of Beastie

Reaction score: 1,180
Messages: 2,764

@gnemmi

Thanks, good to know that even such "non helpable" things help ;)
 
gnemmi

Active Member

Reaction score: 21
Messages: 220

Finally got the complete error messages .. posting in here for the record.

Code:
bge0: PHY write timed out (phy1, reg 0, val 32768)
bge0: PHY read timed out (phy1, reg 0, val 0xffffff)
bge0: PHY read timed out (phy1, reg 24, val 0xffffff)
bge0: PHY read timed out (phy1, reg 16, val 0xffffff)
bge0: PHY write timed out (phy1, reg 16, val 0)
bge0: PHY read timed out (phy1, reg 16, val 0xffffff)
bge0: PHY write timed out (phy1, reg 16, val 0)
bge0: PHY write timed out (phy1, reg 23, val 18)
bge0: flow-through queue init failed
bge0: initialization failure
fwohci0: Phy 1394a available S400, 1 ports.
fwohci0: Link S400, max_rec 2048 bytes.
fwohci0: Initiate bus reset
fwohci0: BUS reset
fwohci0: node_id=0xc000ffc0, gen=1, CYCLEMASTER mode
firewire0: 1 nodes, maxhop <= 0, cable IRM = 0 (me)
firewire0: bus manager 0 (me)
fwohci0: unrecoverable error
ad4: WARNING - SETFEATURES SET TRANSFERMODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES SET TRANSFERMODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
ata3: port not implemented
ad4: WARNING - SETFEATURES SET TRANSFERMODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES SET TRANSFERMODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad4: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=320041041
ad4: WARNING - SETFEATURES SET TRANSFERMODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES SET TRANSFERMODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=271558673
ad4: WARNING - SETFEATURES SET TRANSFERMODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES SET TRANSFERMODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad4: TIMEOUT - READ_DMA48 retrying (0 retries left) LBA=320041041
ad4: WARNING - SETFEATURES SET TRANSFERMODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES SET TRANSFERMODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad4: TIMEOUT - WRITE_DMA48 retrying (0 retries left) LBA=271558673
ad4: WARNING - SETFEATURES SET TRANSFERMODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES SET TRANSFERMODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad4: FAILURE - READ_DMA48 timed out LBA=320041041
g_vfs_done()ad4s4f[READ(offset=22751313920, length=6144)]error = 5 vnode_pager_getpages: I/O read error
and then it keeps on moving from LBA to LBA ...
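To see how many distinct disk requests are actually stuck in a log like this, the LBAs can be pulled out of the TIMEOUT and FAILURE lines. This is just a sketch: `failed_lbas` is a made-up helper name and the sample is a handful of lines copied from the output above.

```shell
#!/bin/sh
# Print the distinct LBAs named in ad(4) TIMEOUT/FAILURE lines read from stdin.
failed_lbas() {
    # Case-sensitive match, so the lowercase "taskqueue timeout" warnings are skipped.
    grep -E 'TIMEOUT|FAILURE' | sed 's/.*LBA=//' | sort -u
}

# Sample lines copied from the resume log above.
sample='ad4: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=320041041
ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=271558673
ad4: TIMEOUT - READ_DMA48 retrying (0 retries left) LBA=320041041
ad4: TIMEOUT - WRITE_DMA48 retrying (0 retries left) LBA=271558673
ad4: FAILURE - READ_DMA48 timed out LBA=320041041'

printf '%s\n' "$sample" | failed_lbas
```

In the excerpt posted here only two LBAs show up, each retried until it fails, which fits a disk that never answers after resume rather than a few bad sectors.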
 
gnemmi

Active Member

Reaction score: 21
Messages: 220

Sorry I didn't post before.

SAFE MODE(#3) gives me the same "Fatal Trap 9" that I get when I boot without ACPI(#2) ...

I think something is seriously hosed :(
 