PyTorch, A Quick Intro
PyTorch works with tensors, which are similar to NumPy arrays.
import torch
import torchvision
ts = torch.tensor([[1,2,3],[4,5,6]]) # Tensor
ts.shape
> torch.Size([2,3])
type(ts)
> <class 'torch.Tensor'>
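Converting between NumPy Arrays and Tensors
# Conversion
import numpy as np
np_array = np.array([1, 2, 3])
np_to_ts = torch.from_numpy(np_array) # Convert a NumPy array to a Tensor
ts_to_np = ts.numpy() # Convert a Tensor to a NumPy array
# The NumPy array and the Tensor share the same memory, so a write to one shows up in the other
ts_to_np[1] = -1 # also sets row 1 of ts to -1
np_to_ts[1] = -1 # also sets np_array[1] to -1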
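To make the shared-memory point concrete, here is a minimal sketch. The names `shared`, `copied`, and `back` are made up for illustration. It contrasts `torch.from_numpy`, which shares memory, with `torch.tensor`, which copies.

```python
import numpy as np
import torch

np_array = np.array([1, 2, 3])
shared = torch.from_numpy(np_array)  # shares memory with np_array
copied = torch.tensor(np_array)      # independent copy of the data

np_array[1] = -1                     # write through the NumPy side
print(shared)                        # the change is visible in the tensor
print(copied)                        # the copy still holds the old values

back = shared.numpy()                # converting back is also a view, not a copy
back[0] = 100                        # write through the converted array
print(np_array)                      # the original array sees this write too
```

If you need an array or tensor you can modify freely, copy it explicitly (for example with `torch.tensor(...)` or `.clone()`).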
Assignment and slicing work the same as in NumPy. So does boolean masking:
ts = torch.tensor([[1,2,3],[4,5,6]])
mask = ts > 1
new_array = ts[mask]
print(new_array)
> tensor([2, 3, 4, 5, 6])
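A short sketch of the NumPy-style slicing and masked assignment mentioned above. The variable names `first_row`, `last_col`, and `clipped` are only for illustration.

```python
import torch

ts = torch.tensor([[1, 2, 3], [4, 5, 6]])

first_row = ts[0]         # index a row, like NumPy
last_col = ts[:, -1]      # slice the last column

clipped = ts.clone()      # clone so the original tensor stays untouched
clipped[clipped > 4] = 4  # masked assignment: cap every value above 4

print(first_row)
print(last_col)
print(clipped)
```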
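Same meaning, different syntax: each arithmetic operator has an equivalent torch function. The operators are implemented with dunder methods such as __add__.
x + y # same as torch.add(x, y); torch.add(x, y, out=result_add) writes the sum into an existing tensor
x - y # same as torch.sub(x, y)
x * y # same as torch.mul(x, y)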
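As a quick check of these equivalences, here is a small sketch. The tensors `x` and `y` and their values are made up for illustration. `result_add` is preallocated so that `out=` has somewhere to write.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0])
y = torch.tensor([10.0, 20.0, 30.0])

# Operator, function, and dunder method all compute the same sum.
via_operator = x + y
via_function = torch.add(x, y)
via_dunder = x.__add__(y)

# out= writes the result into a preallocated tensor instead of creating a new one.
result_add = torch.empty(3)
torch.add(x, y, out=result_add)

print(torch.equal(via_operator, via_function))
print(torch.equal(x - y, torch.sub(x, y)))
print(torch.equal(x * y, torch.mul(x, y)))
```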
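Watch out for division with ints. Early PyTorch releases returned ints when dividing integer tensors with / (NOT floats like with NumPy). Recent releases perform true division and return floats, just like NumPy. Use // if you want integer (floor) division.
a = torch.tensor([4,4])
b = torch.tensor([3,3])
a / b # tensor([1.3333, 1.3333]) in recent releases
a // b # tensor([1, 1])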
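A minimal sketch of the division options, assuming a recent PyTorch release (the `rounding_mode` argument of `torch.div` is not available in old versions). The names `true_div`, `floor_div`, and `explicit_floor` are only for illustration.

```python
import torch

a = torch.tensor([4, 4])
b = torch.tensor([3, 3])

true_div = a / b    # true division: the result is a float tensor
floor_div = a // b  # floor division: the result stays an integer tensor

# Same floor division, spelled out explicitly (assumes a recent PyTorch).
explicit_floor = torch.div(a, b, rounding_mode='floor')

print(true_div.dtype, floor_div.dtype)
print(true_div, floor_div, explicit_floor)
```

Checking the dtype of the result is a quick way to find out which behavior your installed version has.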
PyTorch has its own DataLoader (thank goodness, writing one is a pain). It batches the samples and, by default, converts NumPy arrays and Python numbers to PyTorch tensors.
from torch.utils.data import DataLoader
pytorch_dataloader = DataLoader(our_csv_dataset, batch_size=batch_size)
# We can iterate over samples in exactly the same way
for i, item in enumerate(pytorch_dataloader):
    print('Starting item {}'.format(i))
    print('item contains')
    for key in item:
        print(key)
        print(type(item[key]))
        print(item[key].shape)
    if i+1 >= 1:
        break
## Output ##
> Starting item 0
> item contains
> features # key
> <class 'torch.Tensor'> # type of item
> torch.Size([4, 2]) # shape of item
> target # key
> <class 'torch.Tensor'> # type of item
> torch.Size([4, 1]) # shape of item
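The loop above relies on a dataset defined elsewhere. Here is a self-contained sketch that produces batches of the same shape. `ToyDictDataset` and its random data are hypothetical stand-ins for `our_csv_dataset`, not the original dataset.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader


class ToyDictDataset(Dataset):
    """Hypothetical stand-in dataset: each sample is a dict of NumPy arrays."""

    def __init__(self, n_samples=10):
        rng = np.random.default_rng(0)
        self.features = rng.normal(size=(n_samples, 2)).astype(np.float32)
        self.target = rng.normal(size=(n_samples, 1)).astype(np.float32)

    def __len__(self):
        return len(self.features)

    def __getitem__(self, idx):
        # NumPy arrays go in; the DataLoader's default collate step
        # stacks them into batched torch tensors.
        return {'features': self.features[idx], 'target': self.target[idx]}


loader = DataLoader(ToyDictDataset(), batch_size=4)
batch = next(iter(loader))
for key in batch:
    print(key, type(batch[key]), batch[key].shape)
```

With 10 samples and `batch_size=4`, the last batch holds only 2 samples. Pass `drop_last=True` if every batch must be full.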
Torchvision
Torchvision comes with ready-made dataset classes for common datasets like ImageNet and FashionMNIST. That way you don’t end up having to write boilerplate code.
- transforms.Compose lets you apply a series of transformations in a row.
- transforms.ToTensor converts a PIL image or numpy.ndarray of shape (H × W × C) in the range [0, 255] to a torch.FloatTensor of shape (C × H × W) in the range [0.0, 1.0].
- transforms.Normalize normalizes a tensor image with a mean and standard deviation.
- datasets.FashionMNIST loads the Fashion MNIST dataset (downloading it if requested) and applies the transforms to the data. Set train=True to get the training set, or train=False to get the test set.
- torch.utils.data.DataLoader takes our training or test data along with the parameters batch_size and shuffle. batch_size defines how many samples to load per batch. shuffle=True reshuffles the data at every epoch.
A method for displaying images
Useful for looking at one or a few selected images in the dataset.
import matplotlib.pyplot as plt
import numpy as np
fashion_mnist_dataloader = DataLoader(fashion_mnist_dataset, batch_size=8)
def imshow(img):
    img = img / 2 + 0.5 # unnormalize (assumes Normalize with mean 0.5 and std 0.5)
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0))) # (C, H, W) -> (H, W, C) for matplotlib
    plt.show()
# get the first batch of training images (pass shuffle=True to the DataLoader for a random batch)
dataiter = iter(fashion_mnist_dataloader)
images, labels = next(dataiter)
# show images
imshow(torchvision.utils.make_grid(images))
# print labels
print(' '.join('%5s' % classes[labels[j]] for j in range(8)))