Posts: 1,436
Threads: 7
Joined: Jun 2010
No, it also requires disk access for logs, and calls fflush after every line. fflush isn't fsync though, and even in a datacenter with a dedicated storage server rather than disks right in the machines, I can't see how that could block for a whole second. Also, there's no measurable delay between writes before and after the spikes. Massive IO delays can of course happen, e.g. on the first access after some inactivity, but that's clearly not the case here.
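To put rough numbers on the flush-per-line pattern, here is a minimal sketch in Python rather than the server's own code. `write_lines` and the file names are made up for illustration; `f.flush()` plays the role of fflush (user-space buffer to kernel) and `os.fsync()` the role of fsync (kernel cache to the storage device).

```python
import os
import tempfile
import time


def write_lines(path, lines, sync=False):
    """Append lines one at a time, flushing after each.

    Returns the per-line latency in seconds, so outliers can be spotted.
    """
    latencies = []
    with open(path, "a", encoding="utf-8") as f:
        for line in lines:
            start = time.perf_counter()
            f.write(line + "\n")
            f.flush()                 # like fflush: user-space buffer -> kernel
            if sync:
                os.fsync(f.fileno())  # like fsync: kernel cache -> storage device
            latencies.append(time.perf_counter() - start)
    return latencies


if __name__ == "__main__":
    sample = [f"line {i}" for i in range(1000)]
    with tempfile.TemporaryDirectory() as tmp:
        for sync in (False, True):
            path = os.path.join(tmp, f"log_{sync}.txt")
            worst = max(write_lines(path, sample, sync))
            print(f"sync={sync}: worst per-line latency {worst * 1000:.3f} ms")
```

Running it on an affected host, with and without `sync=True`, should show whether a plain flush ever comes anywhere near the stall lengths seen in the logs.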
Posts: 2,136
Threads: 50
Joined: Jun 2010
(20 Apr 12, 12:21PM)tempest Wrote: ...what's this?
Quote:Apr 15 14:40:19: 2 unarmed RVSF 0 202 9 8 1 144 normal xx.xxx.xx.xxx
Apr 15 14:41:13: 4 |AOD|FireFox. CLA 2 270 12 3 0 130 normal xx.xxx.xx.xxx
and
Quote:Apr 16 14:42:23: 0 TheEveMe CLA 0 94 9 18 0 147 normal xxx.xx.xxx.xx
Apr 16 14:42:53: 1 unarmed RVSF 0 44 5 9 1 163 normal xxx.xx.xxx.xx
...almost a second between iterations...
That's twice you've mentioned '...a second...' now. You realise that's seconds you've highlighted? The first log stalls for 54 seconds, the second for 30.
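For anyone who wants to scan a whole log for gaps like these instead of eyeballing them, a rough Python sketch follows. It assumes every line starts with a `Mon DD HH:MM:SS:` stamp as in the excerpts above; `find_stalls`, the 10-second threshold, and the default year are illustrative choices, not anything taken from the server.

```python
import re
from datetime import datetime

# Matches the leading "Apr 15 14:40:19:" style stamp seen in the excerpts.
STAMP = re.compile(r"^([A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}):")


def find_stalls(lines, threshold_s=10, year=2012):
    """Return (previous_line, current_line, gap_seconds) for each large gap.

    The log carries no year, so one is assumed; it only matters
    for leap days and for logs that cross New Year.
    """
    stalls = []
    prev_time = prev_line = None
    for line in lines:
        m = STAMP.match(line)
        if not m:
            continue  # skip lines without a timestamp
        t = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
        if prev_time is not None:
            gap = (t - prev_time).total_seconds()
            if gap > threshold_s:
                stalls.append((prev_line, line, gap))
        prev_time, prev_line = t, line
    return stalls


if __name__ == "__main__":
    excerpt = [
        "Apr 15 14:40:19: 2 unarmed RVSF 0 202 9 8 1 144 normal xx.xxx.xx.xxx",
        "Apr 15 14:41:13: 4 |AOD|FireFox. CLA 2 270 12 3 0 130 normal xx.xxx.xx.xxx",
    ]
    for before, after, gap in find_stalls(excerpt):
        print(f"{gap:.0f} s stall before: {after}")
```

Timestamps only have one-second resolution, so this catches multi-second stalls, not millisecond hiccups.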
Posts: 256
Threads: 15
Joined: Jun 2010
Oh yeah, 'logline'. But that's been around since r4449, so why would it break _now_ (provided it did)? Changes in some library? I wonder whether this issue would magically be "fixed" if one of you devs rebuilt 1104 with a more recent gcc.
Posts: 1,436
Threads: 7
Joined: Jun 2010
20 Apr 12, 07:10PM
(This post was last modified: 20 Apr 12, 07:18PM by tempest.)
Er... nice fail there :D I guess I had to deal too much with milliseconds and microseconds recently.
Well, but that makes it much more obvious, and also easily explains the disconnects - no wonder people time out after half a minute.
Still no explanation for that timeout though, but the fact that it's almost a minute pretty much rules out all kinds of issues, from the scheduler spazzing out to disk latency. Well, not really, but that kind of stuff would have been noticed already. I'm fairly certain this is not a bug in AC - the only way it could stall for such a long time would be an infinite loop, and then it'd stall forever :P
You can never be sure though, so I'd like to get some more information from the people affected by this and try to find some patterns. For example, does this only happen on virtualized systems? Does it happen on systems other than CentOS and Ubuntu? On Windows? Is this actually with file logging enabled, or syslog, or both? Etc. etc.
Posts: 3,462
Threads: 72
Joined: Jun 2010
I've never been clear on whether you have a choice on Linux (I always assumed you did), but this is using syslog. You know the rest of the server specs from my first post.
Posts: 1,436
Threads: 7
Joined: Jun 2010
I don't know if and when syslog() blocks. Are you logging to a local destination, or via network?
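One way to find out empirically is to time the calls on an affected machine. A hedged Python sketch: `timed_call`, the 0.1-second threshold, and the `stalltest` ident are all made up for illustration, and Python's `syslog` module (Unix only) is assumed to behave closely enough to the C `syslog()` the server uses.

```python
import time


def timed_call(fn, *args, warn_after_s=0.1):
    """Call fn(*args) and return (result, elapsed_seconds, was_slow)."""
    start = time.monotonic()
    result = fn(*args)
    elapsed = time.monotonic() - start
    return result, elapsed, elapsed > warn_after_s


if __name__ == "__main__":
    import syslog  # Unix-only standard library module

    syslog.openlog("stalltest")  # hypothetical ident, pick anything
    for i in range(1000):
        _, elapsed, slow = timed_call(syslog.syslog, f"test line {i}")
        if slow:
            print(f"syslog() call {i} blocked for {elapsed:.3f} s")
```

If this never reports a slow call while the game server is stalling, the logging path is probably not the culprit.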
Posts: 3,462
Threads: 72
Joined: Jun 2010
I'm going to bump this as I just recompiled.
Our four servers are back online.
Post here if it explodes on you. We will say sorry and hang our heads.
Posts: 3,462
Threads: 72
Joined: Jun 2010
12 Jul 12, 10:58PM
(This post was last modified: 12 Jul 12, 10:58PM by Ronald_Reagan.)
I removed the -iferric.tk line.
I have a feeling that, even though I have two IPs on that server, I didn't need the -i line and it was screwing things up. I guess I didn't know what I was doing. Waiting on confirmation that this worked; however, after removing it I haven't had any complaints. Either it worked or people stopped using it.
Posts: 2,136
Threads: 50
Joined: Jun 2010
servercmdline.txt Wrote:// don't use these switches, unless you really know what you're doing:
Posts: 3,462
Threads: 72
Joined: Jun 2010
(12 Jul 12, 10:58PM)Ronald_Reagan Wrote: [...]I guess I didn't know what I was doing. Waiting on confirmation that this worked[...]
Posts: 3,462
Threads: 72
Joined: Jun 2010
Doesn't seem to have worked.
Posts: 740
Threads: 61
Joined: Jun 2011
Try disabling public demo downloads; that caused lag on my servers whenever someone went to download a demo. Also, maybe it's the VPS host.
Posts: 327
Threads: 20
Joined: Jul 2010
(16 Jul 12, 09:11PM)Jg99 Wrote: Try disabling public demo downloads; that caused lag on my servers whenever someone went to download a demo. Also, maybe it's the VPS host.
so pro ^ public demo dls dont happen that often
Posts: 740
Threads: 61
Joined: Jun 2011
it was probably when demos got copied
Posts: 2,144
Threads: 38
Joined: Aug 2010
(19 Jul 12, 06:06AM)Jg99 Wrote: it was probably when demos got copied
Except it would happen mid-game...
Posts: 3,462
Threads: 72
Joined: Jun 2010
Demos don't get copied, they get saved directly to the folder read by apache.
In case you were wondering, no, apache does not see the server folder.