Troubleshooting notes: an unreachable file server

Problem

- The file server is unreachable and the nginx log is flooded with errors

2019/10/10 14:15:09 [alert] 22340#0: worker process 8784 exited on signal 25
2019/10/10 14:15:09 [alert] 22340#0: worker process 8785 exited on signal 25
2019/10/10 14:15:09 [alert] 22340#0: worker process 8786 exited on signal 25
2019/10/10 14:15:09 [alert] 22340#0: worker process 8787 exited on signal 25
2019/10/10 14:15:09 [alert] 22340#0: worker process 8788 exited on signal 25

- nginx worker_processes configuration

- Cases in which no core file is produced

The core file will not be generated if
(a) the process was set-user-ID and the current user is not the owner of the program file, or
(b) the process was set-group-ID and the current user is not the group owner of the file,
(c) the user does not have permission to write in the current working directory,
(d) the file already exists and the user does not have permission to write to it, or
(e) the file is too big (recall the RLIMIT_CORE limit in Section 7.11).
The permissions of the core file (assuming that the file doesn't already exist) are usually user-read and user-write, although Mac OS X sets only user-read.

- Debugging the core file with gdb

- ulimit open-file limits and notes on tuning limits.conf
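The conditions quoted above can be checked before waiting for the next crash. This is a minimal sketch, assuming a bash or dash shell. `check_core_dir` is a hypothetical helper name, and it covers only condition (c), write permission on the target directory:

```shell
# check_core_dir DIR: report whether the current user could create a file in DIR.
# Hypothetical helper for illustration; covers only condition (c) of the quoted list.
check_core_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "missing"
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "not writable"
        return 1
    fi
    echo "writable"
}

# Soft core size limit of this shell; 0 means core dumps are disabled.
echo "core limit: $(ulimit -c)"
```

For example, `check_core_dir /data/coredump` tests the directory that holds the cores later in this post. The check applies to whoever runs it, so it is most meaningful when run as the user the nginx workers run as.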

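Root cause

The file server was unreachable because the nginx worker processes were being killed by the system and respawned in an endless loop:
the ulimit file size limit was exceeded when nginx wrote to a file, so the kernel killed the worker with signal 25 (SIGXFSZ, "File size limit exceeded").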
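The signal number in the log can be turned into a name directly in the shell. A small sketch, assuming Linux on x86-64, where signal 25 is SIGXFSZ (numbering differs on a few other architectures):

```shell
# Print the name of signal 25; on Linux x86-64 this prints XFSZ.
kill -l 25

# A process killed by signal N is reported by the shell with exit status 128 + N.
echo $((128 + 25))
```

An exit status of 153 from any command is therefore a hint that the same file size limit was hit.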

Solution

  1. Follow the nginx error log with tail -100f error.log; the workers are being killed and respawned endlessly.
#tail -100f error.log
2019/10/22 09:52:53 [alert] 24626#0: worker process 25426 exited on signal 25
2019/10/22 09:52:53 [alert] 24626#0: worker process 25427 exited on signal 25
2019/10/22 09:52:53 [alert] 24626#0: worker process 25428 exited on signal 25
2019/10/22 09:52:53 [alert] 24626#0: worker process 25429 exited on signal 25
2019/10/22 09:52:53 [alert] 24626#0: worker process 25430 exited on signal 25
2019/10/22 09:52:53 [alert] 24626#0: worker process 25431 exited on signal 25
2019/10/22 09:52:53 [alert] 24626#0: worker process 25432 exited on signal 25
2019/10/22 09:52:53 [alert] 24626#0: worker process 25434 exited on signal 25
2019/10/22 09:52:53 [alert] 24626#0: worker process 25436 exited on signal 25
2019/10/22 09:52:53 [alert] 24626#0: worker process 25439 exited on signal 25
2019/10/22 09:52:53 [alert] 24626#0: worker process 25440 exited on signal 25 (core dumped)
2019/10/22 09:52:54 [alert] 24626#0: worker process 25441 exited on signal 25
2019/10/22 09:52:54 [alert] 24626#0: worker process 25443 exited on signal 25
2019/10/22 09:52:54 [alert] 24626#0: worker process 25444 exited on signal 25
2019/10/22 09:52:54 [alert] 24626#0: worker process 25445 exited on signal 25
2019/10/22 09:52:54 [alert] 24626#0: worker process 25448 exited on signal 25
2019/10/22 09:52:54 [alert] 24626#0: worker process 25449 exited on signal 25 (core dumped)
2019/10/22 09:52:54 [alert] 24626#0: worker process 25450 exited on signal 25 (core dumped)
2019/10/22 09:52:54 [alert] 24626#0: worker process 25452 exited on signal 25 (core dumped)
2019/10/22 09:52:54 [alert] 24626#0: worker process 25453 exited on signal 25
2019/10/22 09:52:54 [alert] 24626#0: worker process 25454 exited on signal 25
2019/10/22 09:52:54 [alert] 24626#0: worker process 25455 exited on signal 25 (core dumped)
...
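  2. Run ulimit -a to check whether core dumps are disabled.
# ulimit -a
core file size          (blocks, -c) 24414		\\ limited to 24414 blocks, about 25 MB with bash's 1024-byte blocks

Core dumps are not disabled, so check whether any core files were generated.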

#ls -alhS /data/coredump/ | head -n 100			\\ list all files sorted by size with human-readable units, first 100 lines
total 3.7G
-rw------- 1 nobody nobody  25M Oct 22 09:32 core.nginx.6836
-rw------- 1 nobody nobody  25M Oct 22 09:32 core.nginx.6862
-rw------- 1 nobody nobody  25M Oct 22 08:57 core.nginx.6913
-rw------- 1 nobody nobody  25M Oct 22 08:57 core.nginx.6844
-rw------- 1 nobody nobody  24M Oct 22 08:22 core.nginx.6815
-rw------- 1 nobody nobody  24M Oct 22 08:57 core.nginx.6822
-rw------- 1 nobody nobody  24M Oct 22 06:37 core.nginx.6835
-rw------- 1 nobody nobody  24M Oct 22 07:47 core.nginx.6845
-rw------- 1 nobody nobody  24M Oct 22 09:32 core.nginx.6846
-rw------- 1 nobody nobody  24M Oct 22 09:32 core.nginx.6848
-rw------- 1 nobody nobody  24M Oct 22 04:16 core.nginx.6871
-rw------- 1 nobody nobody  24M Oct 22 08:57 core.nginx.6872
-rw------- 1 nobody nobody  24M Oct 22 08:57 core.nginx.6873
...
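The block counts that ulimit reports are easier to compare with the listing above once converted to bytes. A minimal sketch, assuming 1024-byte blocks (the unit bash uses for -c and -f outside POSIX mode); `blocks_to_bytes` is a hypothetical helper name:

```shell
# blocks_to_bytes BLOCKS [BLOCK_SIZE]: convert a ulimit value in blocks to bytes.
# BLOCK_SIZE defaults to 1024, an assumption that holds for bash outside POSIX mode.
blocks_to_bytes() {
    echo $(( $1 * ${2:-1024} ))
}

blocks_to_bytes 24414      # core file size limit      -> 24999936
blocks_to_bytes 1024000    # file size limit (before)  -> 1048576000
blocks_to_bytes 10240000   # file size limit (after)   -> 10485760000
```

Under that assumption the core limit is about 25 MB, which matches the 24M to 25M core files in the listing.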
  3. Debug a core file with gdb:
# gdb core.nginx.6838 
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-60.el6_4.1)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
BFD: Warning: /data/coredump/core.nginx.6838 is truncated: expected core file size >= 37912576, found: 356352.
BFD: Warning: /data/coredump/core.nginx.6838 is truncated: expected core file size >= 37912576, found: 356352.
Missing separate debuginfo for the main executable file
Try: yum --disablerepo='*' --enablerepo='*-debug*' install /usr/lib/debug/.build-id/f5/918a454c5b0330bb403453581e77ae8a783b64
[New Thread 6838]
Failed to read a valid object file image from memory.
Core was generated by `nginx: worker process                                                         '.
Program terminated with signal 25, File size limit exceeded.		\\ the worker was killed for exceeding the file size limit
#0  0x00007f325d68c660 in ?? ()
(gdb) q		\\ q quits gdb
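gdb warns above that the core is truncated, which is why the frame cannot be resolved even though the terminating signal is still readable. The same comparison can be made without gdb. A minimal sketch; `core_is_truncated` is a hypothetical helper, and the expected size has to be taken from the BFD warning:

```shell
# core_is_truncated FILE EXPECTED_BYTES: succeed if FILE is smaller than EXPECTED_BYTES.
# Hypothetical helper; EXPECTED_BYTES is the "expected core file size" from the BFD warning.
core_is_truncated() {
    actual=$(( $(wc -c < "$1") ))   # arithmetic expansion strips any padding from wc
    [ "$actual" -lt "$2" ]
}
```

With the numbers from the session above, `core_is_truncated /data/coredump/core.nginx.6838 37912576 && echo truncated` would report the file as truncated.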
  4. Raise the resource limit:
#ulimit -a		\\ show the current limits
core file size          (blocks, -c) 24414
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) 1024000		\\ file size limit, about 1 GB with 1024-byte blocks
pending signals                 (-i) 127426
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) 81920
open files                      (-n) 65535
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 127426
virtual memory          (kbytes, -v) 1048576
file locks                      (-x) unlimited
#vim /etc/profile
+ulimit -f 10240000		\\ add this line
:x		\\ save and quit
#source	/etc/profile		\\ reload the file
#ulimit -a
core file size          (blocks, -c) 24414
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) 10240000		\\ file size limit updated
pending signals                 (-i) 127426
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) 81920
open files                      (-n) 65535
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 127426
virtual memory          (kbytes, -v) 1048576
file locks                      (-x) unlimited
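A limit set with ulimit applies to the current shell and is inherited by every process started from it, which is why the shell that starts nginx matters. The failure can be reproduced safely in a throwaway subshell. This is a sketch under the assumption of a Linux system with `mktemp` and `head -c` available; the file name and sizes are arbitrary:

```shell
# Reproduce the failure in a subshell so the lowered limit does not leak out.
f=$(mktemp)
status=0
(
    ulimit -f 1                       # one block: 1024 bytes in bash, 512 in some other shells
    head -c 4096 /dev/zero > "$f"     # try to write past the limit
) || status=$?

written=$(( $(wc -c < "$f") ))
echo "exit status: $status, bytes written: $written"
rm -f "$f"
```

If the writer was killed by SIGXFSZ the exit status is 153 (128 + 25), and the file stops growing at the limit. The parent shell keeps its own limit, so the experiment has no lasting effect.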
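  5. In the same terminal, restart nginx (the new limit applies only to this shell and the processes started from it)
#/usr/local/nginx/sbin/nginx -s stop		\\ stop nginx
#/usr/local/nginx/sbin/nginx -p /usr/local/nginx -c /usr/local/nginx/conf/nginx.conf		\\ start nginx
#tail -100f ./logs/error.log		\\ follow the log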
2019/10/22 09:52:54 [alert] 24626#0: worker process 25448 exited on signal 25
2019/10/22 09:52:54 [alert] 24626#0: worker process 25449 exited on signal 25 (core dumped)
2019/10/22 09:52:54 [alert] 24626#0: worker process 25450 exited on signal 25 (core dumped)
2019/10/22 09:52:54 [alert] 24626#0: worker process 25452 exited on signal 25 (core dumped)
2019/10/22 09:52:54 [alert] 24626#0: worker process 25453 exited on signal 25
2019/10/22 09:52:54 [alert] 24626#0: worker process 25454 exited on signal 25
2019/10/22 09:52:54 [alert] 24626#0: worker process 25455 exited on signal 25 (core dumped)


\\ No more errors; access is back to normal.
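To confirm that the restarted nginx actually picked up the new limit, the kernel's view of the process can be read from /proc/<pid>/limits. A minimal sketch; `soft_file_size_limit` is a hypothetical helper, and the pid file path is an assumption based on the /usr/local/nginx prefix used above:

```shell
# soft_file_size_limit LIMITS_FILE: print the soft "Max file size" value (in bytes,
# or "unlimited") from a /proc/<pid>/limits file. Hypothetical helper name.
soft_file_size_limit() {
    awk '/^Max file size/ { print $4 }' "$1"
}

pid_file=/usr/local/nginx/logs/nginx.pid   # assumed default pid path for this prefix
if [ -r "$pid_file" ]; then
    soft_file_size_limit "/proc/$(cat "$pid_file")/limits"
fi
```

The value is reported in bytes, so with 1024-byte blocks a limit of 10240000 blocks should appear as 10485760000.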

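Resolved. See the references for details.

References

How to check the system block size on Linux
Notes on /etc/security/limits.conf
Debugging core files with gdb in detail
nginx worker_processes configuration
nginx child (worker) processes failing to start
nginx hotlink protection without blocking spiders causes process termination (see this one)
nginx https/ssl configuration causing pages to be inaccessible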

