Modules

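modules is used to build neural network models and works together with MindSpore. It provides three main groups of functionality: Embedding, Encoder-Decoder, and Attention. The three sections below introduce each of them in turn.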

Embedding

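An embedding is essentially a word-embedding technique that represents a word or phrase as a low-dimensional vector. MindNLP provides a quick way to construct an embedding from pretrained GloVe, fastText, or word2vec word vectors. You can also create your own embedding.

The following example shows how to quickly construct an embedding with MindNLP from pretrained GloVe word vectors.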

import numpy as np
from mindspore import Tensor
from mindspore.dataset.text.utils import Vocab
from mindnlp.modules.embeddings.glove_embedding import Glove

# Define your own vocab
vocab = Vocab.from_list(['default', 'one', 'two', 'three'])

# Define your own embedding table
init_embed = Tensor(np.zeros((4, 4)).astype(np.float32))

# Create your own embedding object
glove_embed = Glove(vocab, init_embed)

# You can also use pre-trained word vectors
glove_embed_pretrained, _ = Glove.from_pretrained()

After creating the embedding, we use it to perform a lookup:

# The index to query for
ids = Tensor([1, 2, 3])

# Computed by the built embedding
output = glove_embed(ids)
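
Conceptually, a lookup like the one above amounts to selecting rows of the embedding table by index. The NumPy sketch below illustrates only that idea; it is not MindNLP's implementation, and the table values and the `token_to_id` mapping are made up for the illustration.

```python
import numpy as np

# Hypothetical 4-word vocabulary with 4-dimensional vectors (values are illustrative).
tokens = ['default', 'one', 'two', 'three']
embed_table = np.arange(16, dtype=np.float32).reshape(4, 4)

# Map each token to its row index in the table.
token_to_id = {tok: i for i, tok in enumerate(tokens)}

# Looking up ids [1, 2, 3] selects rows 1, 2 and 3 of the table.
ids = np.array([token_to_id[t] for t in ['one', 'two', 'three']])
output = embed_table[ids]

print(output.shape)  # (3, 4)
```

Each id yields one row, so three ids against a table with 4-dimensional vectors produce a (3, 4) result.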

You can find more information about the embedding API in MindNLP.modules.embeddings.

Encoder-Decoder

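Encoder-Decoder is a model architecture and a general term for a family of algorithms. Within this framework, different algorithms can be used to solve different tasks. The Encoder converts the input sequence into a semantic vector, and the Decoder generates the target output, such as a translation, from the Encoder's output.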
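
The division of labour between the two parts can be sketched with a toy example in plain NumPy. This is an illustration under stated assumptions, not the MindNLP modules: the weights are random rather than learned, the names `rnn_step`, `encode`, and `decode_step` are hypothetical, and the encoder and decoder share one set of RNN weights purely to keep the sketch short.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, emb_dim, hid_dim = 10, 8, 16

# Randomly initialised parameters (illustrative; a real model would learn these).
embedding = rng.normal(size=(vocab_size, emb_dim))
w_in = rng.normal(size=(emb_dim, hid_dim)) * 0.1
w_hid = rng.normal(size=(hid_dim, hid_dim)) * 0.1
w_out = rng.normal(size=(hid_dim, vocab_size)) * 0.1

def rnn_step(token_id, hidden):
    """One vanilla RNN step: combine the token embedding with the previous state."""
    return np.tanh(embedding[token_id] @ w_in + hidden @ w_hid)

def encode(src_ids):
    """Encoder: fold the whole input sequence into a single context vector."""
    hidden = np.zeros(hid_dim)
    for token_id in src_ids:
        hidden = rnn_step(token_id, hidden)
    return hidden

def decode_step(prev_id, hidden):
    """Decoder: update the state and score every vocabulary entry for the next token."""
    hidden = rnn_step(prev_id, hidden)
    logits = hidden @ w_out
    return logits, hidden

context = encode([1, 4, 2])              # semantic vector of the input sequence
logits, state = decode_step(0, context)  # the decoder starts from the encoder's output
next_id = int(np.argmax(logits))         # greedy choice of the next output token
```

Repeating `decode_step` with `next_id` and `state` as inputs would generate the output sequence one token at a time.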

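We can build a network from the Encoder and Decoder modules provided by MindNLP, as the machine translation model below shows. For more information about this model, see Machine Translation.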

from mindspore import nn
from mindnlp.abc import Seq2seqModel
from mindnlp.modules import RNNEncoder, RNNDecoder

class MachineTranslation(Seq2seqModel):
    def __init__(self, encoder, decoder):
        super().__init__(encoder, decoder)
        self.encoder = encoder
        self.decoder = decoder

    def construct(self, en, de):
        encoder_out = self.encoder(en)
        decoder_out = self.decoder(de, encoder_out=encoder_out)
        output = decoder_out[0]
        return output.swapaxes(1,2)

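# Source and target vocabulary sizes.
# Placeholder values for illustration only; set them from your own vocabularies.
input_dim = 1000
output_dim = 1000

enc_emb_dim = 256
dec_emb_dim = 256
enc_hid_dim = 512
dec_hid_dim = 512
enc_dropout = 0.5
dec_dropout = 0.5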

# encoder
en_embedding = nn.Embedding(input_dim, enc_emb_dim)
en_rnn = nn.RNN(enc_emb_dim, hidden_size=enc_hid_dim, num_layers=2, has_bias=True,
                batch_first=True, dropout=enc_dropout, bidirectional=False)
rnn_encoder = RNNEncoder(en_embedding, en_rnn)

# decoder
de_embedding = nn.Embedding(output_dim, dec_emb_dim)
input_feed_size = 0 if enc_hid_dim == 0 else dec_hid_dim
rnns = [
    nn.RNNCell(
        # The first layer also receives the fed-back hidden state
        input_size=(dec_emb_dim + input_feed_size) if layer == 0 else dec_hid_dim,
        hidden_size=dec_hid_dim
    )
    for layer in range(2)
]
rnn_decoder = RNNDecoder(de_embedding, rnns, dropout_in=enc_dropout, dropout_out=dec_dropout,
                         attention=True, encoder_output_units=enc_hid_dim)

The Encoder-Decoder modules included in MindNLP are listed in the table below. You can click a name to view its detailed API, or consult MindNLP.modules.encoder and MindNLP.modules.decoder.

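| Name | Description |
| --- | --- |
| CNNEncoder | Convolutional encoder built from the convolutions argument passed in |
| RNNEncoder | Recurrent neural network (RNN) encoder |
| RNNDecoder | Recurrent neural network (RNN) decoder |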

Attention

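Attention mimics the attention mechanism of the human brain. When people look at something, they tend to focus on the important information and ignore the rest. In natural language processing, attention assigns weights over the text so that the model knows where to focus; in essence, it moves from attending to everything equally toward attending to the relevant parts. MindNLP provides a range of attention modules so that you can start using them quickly.

Next, we show how to build a multi-head attention module with MindNLP.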

import mindspore
import mindspore.numpy as np
from mindspore import ops
from mindspore import Tensor
from mindnlp.modules.attentions import MutiHeadAttention
# initialize random number seeds
standard_normal = ops.StandardNormal(seed=0)

# query is [batch_size, seq_len_q, hidden_size]
q = standard_normal((2, 32, 512))

# key is [batch_size, seq_len_k, hidden_size]
k = standard_normal((2, 20, 512))

# value is [batch_size, seq_len_k, hidden_size]
v = standard_normal((2, 20, 512))

# after splitting into 8 heads, query goes from (2, 32, 512) to (2, 8, 32, 64)
# and key goes from (2, 20, 512) to (2, 8, 20, 64)
# query * key.transpose(-1, -2):
# (2, 8, 32, 64) * (2, 8, 64, 20) -> (2, 8, 32, 20)
# so the mask has shape [batch_size, seq_len_q, seq_len_k]
mask_shape = (2, 32, 20)
mask = Tensor(np.ones(mask_shape), mindspore.bool_)

# use additive attention
net = MutiHeadAttention(heads=8, attention_mode="add")
# you can also use cosine attention via multi-head attention
net = MutiHeadAttention(heads=8, attention_mode="cos")
# you can also use dot-product attention via multi-head attention
# default dot-product attention mode
net = MutiHeadAttention(heads=8)

# x is the output of multi-head attention
# attn is the attention score
x, attn = net(q, k, v, mask)
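
The shape bookkeeping described in the comments above can be checked with a short NumPy sketch. It covers only the head split and the score computation; the linear projections that a multi-head layer normally applies first are left out, the helper `split_heads` is a hypothetical name, and the mask convention (True means "keep") is an assumption for the illustration rather than a statement about MindNLP's API.

```python
import numpy as np

batch, heads, hidden = 2, 8, 512
head_dim = hidden // heads  # 64

rng = np.random.default_rng(0)
q = rng.standard_normal((batch, 32, hidden))
k = rng.standard_normal((batch, 20, hidden))

def split_heads(x):
    """(batch, seq_len, hidden) -> (batch, heads, seq_len, head_dim)."""
    b, seq_len, _ = x.shape
    return x.reshape(b, seq_len, heads, head_dim).transpose(0, 2, 1, 3)

q_heads = split_heads(q)  # (2, 8, 32, 64)
k_heads = split_heads(k)  # (2, 8, 20, 64)

# (2, 8, 32, 64) @ (2, 8, 64, 20) -> (2, 8, 32, 20)
scores = q_heads @ k_heads.transpose(0, 1, 3, 2)

# A (batch, seq_len_q, seq_len_k) mask gains a head axis and broadcasts over all heads.
# Assumed convention: True keeps a position, False masks it out.
mask = np.ones((batch, 32, 20), dtype=bool)
masked_scores = np.where(mask[:, None, :, :], scores, -np.inf)

print(scores.shape)  # (2, 8, 32, 20)
```

Because the mask has no head axis, a single (2, 32, 20) mask is shared by all 8 heads.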

Of course, you can also build a module with the most basic mechanism, scaled dot-product attention:

import numpy as np
import mindspore
from mindspore import Tensor
from mindnlp.modules.attentions import ScaledDotAttention
model = ScaledDotAttention(dropout=0.9)
# You can customize the query, key, value vectors
q = Tensor(np.ones((2, 32, 512)), mindspore.float32)
k = Tensor(np.ones((2, 20, 512)), mindspore.float32)
v = Tensor(np.ones((2, 20, 400)), mindspore.float32)
output, att = model(q, k, v)
# output shape is (2, 32, 400)
# att shape is (2, 32, 20)
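
For reference, the standard definition of scaled dot-product attention, softmax(q k^T / sqrt(d)) v, can be written in a few lines of NumPy. This is a minimal sketch of the textbook formula, not MindNLP's implementation: dropout is omitted and `scaled_dot_attention` is a hypothetical helper name. With inputs of the same shapes as above, the weights have shape (batch, seq_len_q, seq_len_k) and the output takes its last dimension from the value tensor.

```python
import numpy as np

def scaled_dot_attention(q, k, v):
    """Return (output, attention weights) for softmax(q k^T / sqrt(d)) v."""
    d = q.shape[-1]
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(d)
    # Numerically stable softmax over the key axis.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

q = np.ones((2, 32, 512), dtype=np.float32)
k = np.ones((2, 20, 512), dtype=np.float32)
v = np.ones((2, 20, 400), dtype=np.float32)

output, att = scaled_dot_attention(q, k, v)
print(output.shape)  # (2, 32, 400)
print(att.shape)     # (2, 32, 20)
```

Since every input here is a tensor of ones, all keys score equally and each query spreads its weight uniformly over the 20 keys.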

MindNLP currently implements 8 attention mechanisms. For more detailed implementations and information, see our documentation at MindNLP.modules.attentions.