Let's talk about the memory leak caused by not closing GZIPOutputStream and GZIPInputStream, and about how these GZIP streams allocate and release off-heap memory. Let's do it!

Background

My project has a utility class, ZipHelper, that compresses and decompresses Strings:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
/**
* Compresses and decompresses strings.
*/
public class ZipHelper {

// Compress
public static String compress(String str) throws Exception {
if (str == null || str.length() == 0) {
return str;
}
ByteArrayOutputStream out = new ByteArrayOutputStream();
GZIPOutputStream gzip = new GZIPOutputStream(out);
gzip.write(str.getBytes());
gzip.close();
return out.toString("ISO-8859-1");
}

// Decompress
public static String uncompress(String str) throws Exception {
if (str == null || str.length() == 0) {
return str;
}
ByteArrayOutputStream out = new ByteArrayOutputStream();
ByteArrayInputStream in = new ByteArrayInputStream(str.getBytes("ISO-8859-1"));
GZIPInputStream gunzip = new GZIPInputStream(in);
byte[] buffer = new byte[1024];
int n;
while ((n = gunzip.read(buffer)) >= 0) {
out.write(buffer, 0, n);
}
return out.toString();
}
}

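Recently our service started eating into swap space. We initially suspected a memory leak, and analysis eventually showed that the native method Java_java_util_zip_Inflater_init kept allocating memory without ever releasing it (for the analysis method, see the blog post "内存泄露分析实战", Memory Leak Analysis in Practice). The most likely cause was a stream that was never closed, and the biggest problem with this code is that it does not close its streams in a finally block. So I decided to rework it with the try-with-resources syntactic sugar, and the code became: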

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/**
* Created by jacob.
*
* Compresses and decompresses strings.
*/
public class ZipHelper {

/**
* Compresses a string.
*
* @param str the string to compress
* @return the compressed string
* @throws Exception if compression fails
*/
public static String compress(String str) throws Exception {
if (str == null || str.length() == 0) {
return str;
}
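// ByteArrayOutputStream and ByteArrayInputStream are in-memory streams;
// their close() methods are no-ops in the JDK source, so closing them is optional.
// They are placed in the try-with-resources header only to keep the code tidy.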
try (ByteArrayOutputStream out = new ByteArrayOutputStream();
GZIPOutputStream gzip = new GZIPOutputStream(out)) {
gzip.write(str.getBytes());
// gzip.finish();
return out.toString("ISO-8859-1");
}
}

/**
* Decompresses a string.
*
* @param str the string to decompress
* @return the decompressed string
* @throws Exception if decompression fails
*/
public static String uncompress(String str) throws Exception {
if (str == null || str.length() == 0) {
return str;
}
try (ByteArrayOutputStream out = new ByteArrayOutputStream();
ByteArrayInputStream in = new ByteArrayInputStream(str.getBytes("ISO-8859-1"));
GZIPInputStream gunzip = new GZIPInputStream(in)) {
byte[] buffer = new byte[1024];
int n;
while ((n = gunzip.read(buffer)) >= 0) {
out.write(buffer, 0, n);
}
return out.toString();
}
}
}

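Looks much tidier, doesn't it? With this code, though, compression appears to work but decompression throws. At first I assumed the decompression code was broken; eventually I found that compression had never actually completed, so there was nothing valid to decompress. The error was: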

Exception in thread "main" java.io.EOFException: Unexpected end of ZLIB input stream
at java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:240)
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158)
at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:117)
at java.io.FilterInputStream.read(FilterInputStream.java:107)
at coderbean.ZipHelper.uncompress(ZipHelper.java:52)
at coderbean.Main.main(Main.java:12)

How could working code suddenly fail to compress? The cause turned out to be in GZIPOutputStream: its close() method calls finish().

/**
* Writes remaining compressed data to the output stream and closes the
* underlying stream.
* @exception IOException if an I/O error has occurred
*/
public void close() throws IOException {
if (!closed) {
finish();
if (usesDefaultDeflater)
def.end();
out.close();
closed = true;
}
}

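The remaining compressed data and the GZIP trailer are written to the underlying output stream only in the finish() method below. The original code called close(), which called finish() indirectly. So what went wrong with our try-with-resources version? The problem is when close() runs.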

/**
* Finishes writing compressed data to the output stream without closing
* the underlying stream. Use this method when applying multiple filters
* in succession to the same output stream.
* (Author's note: only this method writes the remaining compressed data to the output
* stream; the original code called close(), which calls finish() indirectly.)
* @exception IOException if an I/O error has occurred
*/
public void finish() throws IOException {
if (!def.finished()) {
def.finish();
while (!def.finished()) {
int len = def.deflate(buf, 0, buf.length);
if (def.finished() && len <= buf.length - TRAILER_SIZE) {
// last deflater buffer. Fit trailer at the end
writeTrailer(buf, len);
len = len + TRAILER_SIZE;
out.write(buf, 0, len);
return;
}
if (len > 0)
out.write(buf, 0, len);
}
// if we can't fit the trailer at the end of the last
// deflater buffer, we write it separately
byte[] trailer = new byte[TRAILER_SIZE];
writeTrailer(trailer, 0);
out.write(trailer);
}
}
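To make the effect of finish() visible, here is a minimal sketch that measures how many bytes have reached the underlying ByteArrayOutputStream before and after close(). The class name FinishTimingDemo and the method sizesAroundClose are hypothetical names chosen for this illustration, and UTF-8 is an assumed encoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class, not part of the original ZipHelper.
class FinishTimingDemo {

    // Returns {bytes in the sink before close(), bytes in the sink after close()}.
    static int[] sizesAroundClose(String text) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        GZIPOutputStream gzip = new GZIPOutputStream(out);
        gzip.write(text.getBytes(StandardCharsets.UTF_8));
        // At this point the sink holds the GZIP header plus whatever the
        // deflater has chosen to emit so far; the stream is not complete.
        int before = out.size();
        // close() calls finish(), which drains the deflater and appends the trailer.
        gzip.close();
        int after = out.size();
        return new int[] {before, after};
    }

    public static void main(String[] args) throws IOException {
        int[] sizes = sizesAroundClose("hello gzip");
        System.out.println("before close: " + sizes[0] + " bytes, after close: " + sizes[1] + " bytes");
    }
}
```

Reading the sink at the "before" point is what the reworked compress() effectively does, which is why the resulting data cannot be decompressed.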

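When and under what conditions try-with-resources runs

try-with-resources is syntactic sugar added in JDK 7 (essentially borrowed from C#) that closes resources automatically, provided the class implements the close() method of AutoCloseable.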


package java.lang;

public interface AutoCloseable {
/**
* @throws Exception if this resource cannot be closed
*/
void close() throws Exception;
}

Once a class implements this interface, its instances are closed automatically when the try block finishes:

try (/* initialize resources here */) {
// do something
} // resources are closed as the last step when the block exits
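The ordering matters when the try block contains a return statement: the return expression is evaluated first, and only then are the resources closed, in the reverse order of their declaration. The sketch below records that order; Tracked, snapshot and CloseOrderDemo are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class that logs when resources are opened and closed.
class CloseOrderDemo {
    static final List<String> LOG = new ArrayList<>();

    // A trivial resource that records its own lifecycle.
    static class Tracked implements AutoCloseable {
        private final String name;

        Tracked(String name) {
            this.name = name;
            LOG.add("open " + name);
        }

        @Override
        public void close() {
            LOG.add("close " + name);
        }
    }

    // Stands in for out.toString(...) in the return statement.
    static String snapshot() {
        LOG.add("evaluate return value");
        return "result";
    }

    static String run() {
        try (Tracked first = new Tracked("first");
             Tracked second = new Tracked("second")) {
            return snapshot(); // evaluated before any close() runs
        }
    }

    // Runs the scenario once and returns a copy of the recorded events.
    static List<String> trace() {
        LOG.clear();
        run();
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        for (String event : trace()) {
            System.out.println(event);
        }
        // prints: open first, open second, evaluate return value, close second, close first
    }
}
```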

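GZIPOutputStream only writes the complete compressed data to the underlying stream once finish() or close() has run. My reworked code never calls finish() first, so the GZIPOutputStream is closed only when the try block exits. That close() runs after out.toString("ISO-8859-1") has been evaluated, so the returned string holds an incomplete GZIP stream. ZipHelper.compress() has no way to notice this and returns the incomplete string, which is why decompression fails later.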
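One possible fix, sketched here rather than taken from the original project, is to read the ByteArrayOutputStream only after the try-with-resources block has closed the GZIPOutputStream. The class name CompressSketch is hypothetical, and encoding the input as UTF-8 is an assumption (the original code uses the platform default charset).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical corrected version of compress(); a sketch, not the author's final code.
class CompressSketch {

    static String compress(String str) throws IOException {
        if (str == null || str.isEmpty()) {
            return str;
        }
        // Declared outside the try header so it can be read after gzip is closed.
        // ByteArrayOutputStream.close() is a no-op, so it needs no cleanup.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(str.getBytes(StandardCharsets.UTF_8)); // assumption: UTF-8 input encoding
        } // close() runs here and calls finish(), completing the GZIP stream
        // ISO-8859-1 maps every byte to exactly one char, so the binary data survives.
        return out.toString("ISO-8859-1");
    }
}
```

Calling gzip.finish() inside the try block before reading out would also work; moving the read after the block simply makes the ordering impossible to get wrong.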

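Why this causes an off-heap memory leak

From the original code we can see that, as long as no exception is thrown, compress() does close its stream. The root of the leak must therefore be uncompress(), which never closes its GZIPInputStream. Tracing the GZIPInputStream constructor and close() quickly gives the answer.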

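The diagram below shows the call flow for allocating and releasing off-heap memory; compare it with the code that follows.
[Figure: call flow for allocating and releasing off-heap memory]

For brevity I have not pasted the JDK source comments in full; interested readers can use the diagram to find them in the source.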

//java.util.zip.GZIPInputStream.java
public
class GZIPInputStream extends InflaterInputStream {

public GZIPInputStream(InputStream in) throws IOException {
this(in, 512); // delegates to the constructor below
}

public GZIPInputStream(InputStream in, int size) throws IOException {
super(in, new Inflater(true), size); // creates a new Inflater
usesDefaultInflater = true;
readHeader(in);
}

public void close() throws IOException {
if (!closed) {
super.close(); // the superclass is java.util.zip.InflaterInputStream
eos = true;
closed = true;
}
}
}
//java.util.zip.Inflater.java

public
class Inflater {

public Inflater(boolean nowrap) {
zsRef = new ZStreamRef(init(nowrap));
}

/**
* Closes the decompressor and discards any unprocessed input.
* This method should be called when the decompressor is no longer
* being used, but will also be called automatically by the finalize()
* method. Once this method is called, the behavior of the Inflater
* object is undefined.
*/
public void end() {
synchronized (zsRef) {
long addr = zsRef.address();
zsRef.clear();
if (addr != 0) {
end(addr);
buf = null;
}
}
}

// Native methods are called here
private native static long init(boolean nowrap);
private native static void end(long addr);
}
//java.util.zip.InflaterInputStream.java

public
class InflaterInputStream extends FilterInputStream {
/**
* Closes this input stream and releases any system resources associated
* with the stream.
* @exception IOException if an I/O error has occurred
*/
public void close() throws IOException {
if (!closed) {
if (usesDefaultInflater)
inf.end();
in.close();
closed = true;
}
}
}
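Since InflaterInputStream.close() is what calls inf.end(), closing the GZIPInputStream is enough to release the native memory. Here is a minimal sketch of an uncompress() that always closes it; UncompressSketch is a hypothetical name, and decoding the result as UTF-8 is an assumption that must match whatever encoding the compressing side used.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

// Hypothetical corrected version of uncompress(); a sketch, not the author's final code.
class UncompressSketch {

    static String uncompress(String str) throws IOException {
        if (str == null || str.isEmpty()) {
            return str;
        }
        // ISO-8859-1 turns each char back into the original byte.
        byte[] compressed = str.getBytes(StandardCharsets.ISO_8859_1);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gunzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[1024];
            int n;
            while ((n = gunzip.read(buffer)) >= 0) {
                out.write(buffer, 0, n);
            }
        } // close() runs here, even if read() throws, and ends the Inflater
        return out.toString("UTF-8"); // assumption: the original text was UTF-8
    }
}
```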

OpenJDK's native implementation of these methods:

JNIEXPORT jlong JNICALL
Java_java_util_zip_Inflater_init(JNIEnv *env, jclass cls, jboolean nowrap)
{
// calloc allocates the off-heap memory here
z_stream *strm = calloc(1, sizeof(z_stream));

if (strm == NULL) {
JNU_ThrowOutOfMemoryError(env, 0);
return jlong_zero;
} else {
const char *msg;
int ret = inflateInit2(strm, nowrap ? -MAX_WBITS : MAX_WBITS);
switch (ret) {
case Z_OK:
return ptr_to_jlong(strm);
case Z_MEM_ERROR:
free(strm);
JNU_ThrowOutOfMemoryError(env, 0);
return jlong_zero;
default:
msg = ((strm->msg != NULL) ? strm->msg :
(ret == Z_VERSION_ERROR) ?
"zlib returned Z_VERSION_ERROR: "
"compile time and runtime zlib implementations differ" :
(ret == Z_STREAM_ERROR) ?
"inflateInit2 returned Z_STREAM_ERROR" :
"unknown error initializing zlib library");
free(strm);
JNU_ThrowInternalError(env, msg);
return jlong_zero;
}
}
}

JNIEXPORT void JNICALL
Java_java_util_zip_Inflater_end(JNIEnv *env, jclass cls, jlong addr)
{
if (inflateEnd(jlong_to_ptr(addr)) == Z_STREAM_ERROR) {
JNU_ThrowInternalError(env, 0);
} else {
free(jlong_to_ptr(addr)); // the off-heap memory is released here
}
}
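The same init/end pairing applies when Inflater and Deflater are used directly rather than through a stream: each instance holds native memory until end() is called. The following sketch pairs every construction with end() in a finally block; InflaterEndDemo and its method names are hypothetical, and the 512-byte buffer size is an arbitrary choice.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical demo of releasing zlib's native memory deterministically.
class InflaterEndDemo {

    static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater(); // allocates native zlib state
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[512];
            while (!deflater.finished()) {
                int len = deflater.deflate(buf);
                out.write(buf, 0, len);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // frees the native state
        }
    }

    static byte[] inflate(byte[] compressed) throws DataFormatException {
        Inflater inflater = new Inflater(); // allocates native zlib state
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[512];
            while (!inflater.finished()) {
                int len = inflater.inflate(buf);
                if (len == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    // Avoid spinning forever on truncated or dictionary-based input.
                    throw new DataFormatException("truncated or unsupported input");
                }
                out.write(buf, 0, len);
            }
            return out.toByteArray();
        } finally {
            inflater.end(); // frees the native state even when inflate() throws
        }
    }
}
```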

References