Tensor data not read correctly from NPY files #144

Open
Spiess opened this issue Dec 17, 2018 · 9 comments

Comments

@Spiess

Spiess commented Dec 17, 2018

When I create an NPY tensor file from Python and read it from Scala TensorFlow, the shape of the resulting Tensor is correct, but the content is all zeros.

For example, writing from Python:

import numpy as np

tensor = np.asarray([1.5, -30.4])
np.save('test.npy', tensor)

Reading from Scala TensorFlow:

import java.nio.file.Paths
import org.platanios.tensorflow.api._

val tensor = Tensor.fromNPY[Double](Paths.get("test.npy"))

println(tensor.summarize())

Output:

Tensor[Double, [2]]
[0.0, 0.0]

The output from Scala TensorFlow is the same when saving the Python tensor with allow_pickle=False. The tensor can be read without any problems from Python using NumPy.
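One way to rule out a dtype or byte-order mismatch is to check what the NPY header itself declares before the file ever reaches Scala. The following is a minimal sketch using only the Python standard library; the helper name `read_npy_header` is hypothetical, and it assumes the file uses NPY format version 1.x, 2.x, or 3.x.

```python
import ast
import struct

NPY_MAGIC = b"\x93NUMPY"


def read_npy_header(path):
    """Return (version, header_dict, data_offset) for an NPY file.

    Hypothetical diagnostic helper, not part of NumPy or TF Scala.
    """
    with open(path, "rb") as f:
        if f.read(6) != NPY_MAGIC:
            raise ValueError("not an NPY file: bad magic string")
        major, minor = struct.unpack("BB", f.read(2))
        if major == 1:
            # Version 1.x stores the header length as a 2-byte little-endian integer.
            (header_len,) = struct.unpack("<H", f.read(2))
        elif major in (2, 3):
            # Versions 2.x and 3.x use a 4-byte little-endian integer instead.
            (header_len,) = struct.unpack("<I", f.read(4))
        else:
            raise ValueError("unsupported NPY version: %d.%d" % (major, minor))
        encoding = "utf-8" if major == 3 else "latin1"
        # The header is a Python dict literal padded with spaces and a newline.
        header = ast.literal_eval(f.read(header_len).decode(encoding).strip())
        # The raw array bytes start right after the header.
        return (major, minor), header, f.tell()
```

Calling `read_npy_header("test.npy")` returns the format version, the header dictionary (with the `descr`, `fortran_order`, and `shape` keys), and the offset at which the array bytes begin. Comparing the file size minus that offset against the number of elements times the item size shows whether the file on disk is complete.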

@eaplatanios
Owner

What happens when you save a tensor from TF Scala and try to read it in NumPy? I only tested writing using TF Scala and reading back into TF Scala.

@eaplatanios
Owner

@Spiess I actually just tried your example and it worked fine for me with Python 3.7. Are you sure you pulled TF Scala master and are not using an earlier version? I haven't released the snapshot binaries yet, so you should try the example using the code from the master branch.

@Spiess
Author

Spiess commented Dec 19, 2018

Ah, sorry, I was using TF Scala 0.4.1 and Python 2.7, so it's entirely possible it works on master.

@eaplatanios
Owner

No problem. In that case, and since I can't reproduce this anymore on master, I'll close this. Feel free to reopen if the issue persists. :)

@davidmweber

davidmweber commented Jan 3, 2019

I can confirm that npy files are not being correctly read in 0.4.2-SNAPSHOT either. Attached is iris_x.npy.gz, which reads correctly into Python 3.6 but not into TF Scala 0.4.1 or 0.4.2-SNAPSHOT. These data are the features from the Iris problem set.

I load the (uncompressed) npy file and inspect it as follows:

val x_test = Tensor.fromNPY[Float](Paths.get("iris_x.npy"))
print(x_test.summarize())

This results in:

Tensor[Float, [150, 4]]
[[5.1, 3.5, 1.4, 0.2],
 [4.9, 3.0, 1.4, 0.2],
 [4.7, 3.2, 1.3, 0.2],
 ...,
 [0.0, 0.0, 0.0, 0.0],
 [0.0, 0.0, 0.0, 0.0],
 [0.0, 0.0, 0.0, 0.0]]

The last 15 or so rows of the file are incorrectly read in as zeros.

@davidmweber

davidmweber commented Jan 3, 2019

I have run the debugger through this. The problem appears in Tensor.fromBuffer. In my example, the last 11 rows (11 * 4 columns * 4 bytes = 176 bytes) are set to zero. I cannot inspect the direct buffer's underlying allocated buffer without cloning the project and tooling up a bit.

    this synchronized {
      // TODO: May behave weirdly for direct byte buffers allocated on the Scala side.
      val directBuffer = {
        if (buffer.isDirect) {
          buffer
        } else {
          val direct = ByteBuffer.allocateDirect(numBytes.toInt)
          val bufferCopy = buffer.duplicate()
          direct.put(bufferCopy.limit(numBytes.toInt).asInstanceOf[ByteBuffer]) // <<< This copy is suspect
          direct
        }
      }

Shout if I can assist more.
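The row and byte counts can be cross-checked mechanically once the tensor values are exported as plain rows. This is a minimal sketch under stated assumptions: the helper names `count_trailing_zero_rows` and `zeroed_bytes` are hypothetical, the rows are ordinary Python lists of numbers, and the genuine data is assumed to contain no all-zero rows, so any zero row at the tail marks data that was never copied.

```python
def count_trailing_zero_rows(rows):
    """Count consecutive all-zero rows at the end of a 2-D list (hypothetical helper)."""
    count = 0
    for row in reversed(rows):
        if any(value != 0 for value in row):
            break  # first row with real data, stop counting
        count += 1
    return count


def zeroed_bytes(rows, itemsize):
    """Convert the trailing zero-row count to bytes: rows * columns * bytes per element."""
    if not rows:
        return 0
    return count_trailing_zero_rows(rows) * len(rows[0]) * itemsize
```

For a [150, 4] Float tensor (4 bytes per element) whose last 11 rows are zero, `zeroed_bytes(rows, 4)` gives 11 * 4 * 4 = 176, which matches the figure above. Knowing the exact boundary makes it easier to compare against buffer positions and limits in the debugger.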

@eaplatanios
Owner

@davidmweber thanks for testing this and digging a bit into it! I’ll try to resolve it once I’m back from traveling in about 3-4 days and so for now I’ll simply reopen the issue. :)

eaplatanios reopened this Jan 5, 2019

@davidmweber

I have some code that I can repurpose to test the ByteBuffer -> tensor and npy -> tensor implementations. The npy -> tensor test needs a pre-stored file somewhere, but I think it will be a useful test case.

@eaplatanios
Owner

@davidmweber This would indeed be a useful test case. Could you share it in a PR so I can use it to debug this issue? Thanks!


3 participants