FishC Forum (鱼C论坛)


[Solved] How do I actively terminate a Python program?

Posted on 2022-12-23 09:21:25

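I currently have two files, a.py and b.py.
While it runs, a.py needs to import b to perform some functionality. a.py stays running the whole time, and b.py, after it has finished its work, does not end either; it also stays running.
As a result, b.py can only run once and cannot run a second time unless a.py is ended. I tried adding sys.exit(0) to b.py, but when b.py ends it ends a.py as well.
So is there a way to keep a.py running while b.py exits after each run, so that it can then be run a second time, a third time, and so on?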
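
For context on why sys.exit(0) inside b.py also ends a.py: sys.exit raises the SystemExit exception, and an exception raised while a module is being imported propagates to the importer. The sketch below is only an illustration under stated assumptions: `b_demo` and its source are a hypothetical stand-in written to a temporary directory, not the poster's b.py, and `import_and_survive` is a made-up helper name.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical stand-in for b.py: does some work, then calls sys.exit(0).
MODULE_SOURCE = "import sys\nprint('b_demo: working')\nsys.exit(0)\n"


def import_and_survive(module_name, source):
    """Import a module that calls sys.exit() without ending the caller."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, module_name + ".py")
        with open(path, "w", encoding="utf-8") as f:
            f.write(source)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            importlib.import_module(module_name)
        except SystemExit as exc:
            # sys.exit() raised SystemExit during the import; catching it
            # here keeps the importing program alive.
            return exc.code
        finally:
            sys.path.remove(tmp)
            sys.modules.pop(module_name, None)
    return None


exit_code = import_and_survive("b_demo", MODULE_SOURCE)
print("still running, exit code was", exit_code)
```

Catching SystemExit in the caller is one option; not calling sys.exit() in the imported file at all is usually simpler.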
Best answer
2022-12-23 10:22:47
If you want to run one program from inside another, it is better to use exec rather than import.

e.g.:
with open('b.py') as fd:
    code = fd.read()
exec(code)
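
One thing to be aware of with the snippet above: exec(code) with no extra arguments runs the file in the caller's current namespace, so names defined by b.py end up in a.py. A possible variant, sketched here with the standard library's runpy.run_path, executes the file in a fresh namespace on every call and returns that namespace. The script contents and the names `b_demo.py` and `run_script_fresh` are hypothetical placeholders, not the poster's code.

```python
import os
import runpy
import tempfile

# Hypothetical stand-in for b.py; the real script's contents are not assumed.
SCRIPT_SOURCE = (
    "counter = globals().get('counter', 0) + 1\n"
    "result = 'run finished'\n"
)


def run_script_fresh(path):
    """Execute a script file in a brand-new namespace and return that namespace."""
    return runpy.run_path(path)


with tempfile.TemporaryDirectory() as tmp:
    script_path = os.path.join(tmp, "b_demo.py")
    with open(script_path, "w", encoding="utf-8") as f:
        f.write(SCRIPT_SOURCE)

    # The script can be run as many times as needed; each run starts clean,
    # so 'counter' never carries over from the previous run.
    runs = [run_script_fresh(script_path) for _ in range(3)]

counters = [ns["counter"] for ns in runs]
print(counters)
```

If the script guards its work with `if __name__ == "__main__":`, passing `run_name="__main__"` to runpy.run_path would be needed for that block to execute.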

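That said, import can also be made to work. The fact that import b returned successfully already shows that b has finished executing.

Importing b a second time has no effect not because b is still running, but because of how Python's import mechanism works.

After importing a module, Python stores the mapping from the module name to the module object in sys.modules.

When you import a module, Python first checks whether it is already in sys.modules; if it is, the module's code is not executed again.

So to import the same module more than once, you first have to delete it from sys.modules.

e.g.: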
import sys
import b
del sys.modules['b']
import b
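
As an alternative to deleting the entry by hand, the standard library also provides importlib.reload, which re-executes an already imported module. The following is a minimal sketch, assuming a hypothetical module `b_demo` that appends one line to a log file each time its body runs; it is written to a temporary directory purely for the demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for b.py: each execution appends one line to a log file.
MODULE_SOURCE = (
    "from pathlib import Path\n"
    "with Path(__file__).with_name('runs.log').open('a') as f:\n"
    "    f.write('ran\\n')\n"
)

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "b_demo.py").write_text(MODULE_SOURCE, encoding="utf-8")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        b_demo = importlib.import_module("b_demo")  # first import: body runs
        b_demo = importlib.import_module("b_demo")  # found in sys.modules: body does not run
        importlib.reload(b_demo)                    # body runs a second time
        importlib.reload(b_demo)                    # body runs a third time
        run_count = len(Path(tmp, "runs.log").read_text().splitlines())
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("b_demo", None)

print(run_count)
```

Note that reload reuses the existing module object, so names left over from an earlier run stay defined unless the new run overwrites them.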
Want to know what 小甲鱼 has been up to lately? Visit -> ilovefishc.com

Posted on 2022-12-23 09:23:08

Could you post the actual code of b.py?

Posted on 2022-12-23 09:26:14

A function should have a return; what does your b actually do?
Please post the full code.

Posted on 2022-12-23 09:35:11 from FishC Mobile

I think you should redesign your code logic. You have posted several threads, and it looks like your program takes a long detour to implement a simple piece of logic.

Posted on 2022-12-23 10:02:23
Post your code and I will fix it up for you; I think there may be a problem with your code logic.

Posted on 2022-12-23 10:22:47 | Best answer (shown in full above)

Posted on 2022-12-23 10:42:33
I am just curious what b actually does; running things this way is rather unconventional.

OP | Posted on 2022-12-23 10:49:32
tommyyu posted on 2022-12-23 09:23
Could you post the actual code of b.py?

import torch
#import cv2
import torch.nn.functional as F
import numpy as np
import json
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import skimage.transform
import argparse
from PIL import Image

torch.cuda.set_device(-1)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)


def caption_image_beam_search(encoder, decoder, image_path, word_map, beam_size=3):
    """
    Reads an image and captions it with beam search.

    :param encoder: encoder model
    :param decoder: decoder model
    :param image_path: path to image
    :param word_map: word map
    :param beam_size: number of sequences to consider at each decode-step
    :return: caption, weights for visualization
    """

    k = beam_size
    vocab_size = len(word_map)

    # Read image and process
    img = np.array(Image.open(image_path))
    # If the image is single-channel (grayscale), convert it to three channels
    # before unpacking the shape, which expects three dimensions
    if img.ndim == 2:
        img = img[:, :, np.newaxis]  # add a channel dimension
        img = np.concatenate([img, img, img], axis=2)  # stack into three channels
    # NumPy image arrays are laid out as (height, width, channels)
    height, width, channel = img.shape
    crop = 256
    # Keep the bottom 256 rows and the central 256 columns
    left = (width - crop) // 2
    img = img[height - crop:, left:left + crop, :]
    img = img.transpose(2, 0, 1)  # move the channel axis to the front
    img = img / 255.
    img = torch.FloatTensor(img).to(device)
    normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                     std=[0.229, 0.224, 0.225])
    transform = transforms.Compose([normalize])
    image = transform(img)  # (3, 256, 256)

    # Encode
    image = image.unsqueeze(0)  # (1, 3, 256, 256)
    encoder_out = encoder(image)  # (1, enc_image_size, enc_image_size, encoder_dim) 1,14,14,2048
    enc_image_size = encoder_out.size(1)
    print('enc_image_size:', enc_image_size)
    encoder_dim = encoder_out.size(3)
    print('encoder_dim:', encoder_dim)
    # Flatten encoding
    encoder_out = encoder_out.view(1, -1, encoder_dim)  # (1, num_pixels, encoder_dim) 1,196,2048
    # Features for each of the image's 196 regions
    # print('encoder_out:',encoder_out)
    num_pixels = encoder_out.size(1)  # second dimension: 196
    # print('num_pixels:',num_pixels)
    # We'll treat the problem as having a batch size of k
    # print(encoder_out.size())
    encoder_out = encoder_out.expand(k, num_pixels, encoder_dim)  # (k, num_pixels, encoder_dim); expand the first dimension from 1 to k, one copy of the features per beam
    # print(encoder_out.size())
    # Tensor to store top k previous words at each step; now they're just <start>
    k_prev_words = torch.LongTensor([[word_map['<start>']]] * k).to(device)  # (k, 1)
    # print('k_prev_words:',k_prev_words)
    # Tensor to store top k sequences; now they're just <start>
    seqs = k_prev_words  # (k, 1)

    # Tensor to store top k sequences' scores; now they're just 0
    top_k_scores = torch.zeros(k, 1).to(device)  # (k, 1)

    # Tensor to store top k sequences' alphas; now they're just 1s. In effect this stores the attended image region for each word, mapped onto a 14*14 tensor
    seqs_alpha = torch.ones(k, 1, enc_image_size, enc_image_size).to(device)  # (k, 1, enc_image_size, enc_image_size)

    # Lists to store completed sequences, their alphas and scores
    complete_seqs = list()
    complete_seqs_alpha = list()
    complete_seqs_scores = list()

    # Start decoding
    step = 1
    h, c = decoder.init_hidden_state(encoder_out)  # h0
    print('h, c', h.size(), c.size())
    # s is a number less than or equal to k, because sequences are removed from this process once they hit <end>
    while True:

        embeddings = decoder.embedding(k_prev_words).squeeze(1)  # (s, embed_dim), e.g. (5, 512)
        print('embeddings', embeddings.size())
        # Encoded image representation and hidden state
        awe, alpha = decoder.attention(encoder_out,
                                       h)  # (s, encoder_dim), (s, num_pixels), e.g. (5, 2048), (5, 196); alpha holds each word's weights over the image regions
        print(' awe, alpha', awe.size(), alpha.size())
        # 0/0
        alpha = alpha.view(-1, enc_image_size, enc_image_size)  # (s, enc_image_size, enc_image_size), e.g. (5, 14, 14)

        gate = decoder.sigmoid(decoder.f_beta(h))  # gating scalar, (s, encoder_dim)

        awe = gate * awe  # apply the gate weights to the features

        h, c = decoder.decode_step(torch.cat([embeddings, awe], dim=1),
                                   (h, c))  # (s, decoder_dim); inputs (512, 2048), (512, 512): the weighted features plus the previous LSTM output and cell state

        scores = decoder.fc(h)  # (s, vocab_size)

        scores = F.log_softmax(scores, dim=1)
        print('scores', scores.size())
        # Add the new word scores to each sequence's running score
        scores = top_k_scores.expand_as(scores) + scores  # (s, vocab_size)
        print('top_k_scores,scores', top_k_scores.size(), scores.size())
        # For the first step, all k points will have the same scores (since same k previous words, h, c)
        if step == 1:
            top_k_scores, top_k_words = scores[0].topk(k, 0, True, True)  # (s)
        else:
            # Unroll and find top scores, and their unrolled indices
            top_k_scores, top_k_words = scores.view(-1).topk(k, 0, True, True)  # (s); pick the top words
        print('top_k_scores,top_k_words', top_k_scores.size(), top_k_words.size())
        # Convert unrolled indices to actual indices of scores
        prev_word_inds = torch.floor_divide(top_k_words, vocab_size)
        # prev_word_inds = top_k_words / vocab_size  # (s)
        next_word_inds = top_k_words % vocab_size  # (s)
        print('top_k_scores,top_k_words,prev_word_inds,next_word_inds', top_k_words, top_k_scores, prev_word_inds,
              next_word_inds)
        # Add new words to sequences, alphas
        seqs = torch.cat([seqs[prev_word_inds], next_word_inds.unsqueeze(1)], dim=1)  # (s, step+1); one more word per sequence
        seqs_alpha = torch.cat([seqs_alpha[prev_word_inds], alpha[prev_word_inds].unsqueeze(1)],  # one more attended image region per sequence
                               dim=1)  # (s, step+1, enc_image_size, enc_image_size)

        # Which sequences are incomplete (didn't reach <end>)? Pick out the sentences that finished in this iteration
        incomplete_inds = [ind for ind, next_word in enumerate(next_word_inds) if
                           next_word != word_map['<end>']]
        complete_inds = list(set(range(len(next_word_inds))) - set(incomplete_inds))

        # Set aside complete sequences
        if len(complete_inds) > 0:
            complete_seqs.extend(seqs[complete_inds].tolist())  # append all completed sequences
            complete_seqs_alpha.extend(seqs_alpha[complete_inds].tolist())
            complete_seqs_scores.extend(top_k_scores[complete_inds])
        k -= len(complete_inds)  # reduce beam length accordingly

        # Proceed with incomplete sequences
        if k == 0:
            break
        # Update the state, keeping only the incomplete sequences
        seqs = seqs[incomplete_inds]
        seqs_alpha = seqs_alpha[incomplete_inds]
        h = h[prev_word_inds[incomplete_inds]]
        c = c[prev_word_inds[incomplete_inds]]
        encoder_out = encoder_out[prev_word_inds[incomplete_inds]]
        top_k_scores = top_k_scores[incomplete_inds].unsqueeze(1)
        k_prev_words = next_word_inds[incomplete_inds].unsqueeze(1)

        # Break if things have been going on too long
        if step > 50:
            break
        step += 1
    # Select the highest-scoring sequence as the return value.
    i = complete_seqs_scores.index(max(complete_seqs_scores))
    seq = complete_seqs[i]
    alphas = complete_seqs_alpha[i]

    return seq, alphas


def visualize_att(image_path, seq, alphas, rev_word_map, smooth=True):
    """
    Visualizes caption with weights at every word.

    Adapted from paper authors' repo: https://github.com/kelvinxu/arct ... visualization.ipynb

    :param image_path: path to image that has been captioned
    :param seq: caption
    :param alphas: weights
    :param rev_word_map: reverse word mapping, i.e. ix2word
    :param smooth: smooth weights?
    """
    image = Image.open(image_path)
    image = image.resize([14 * 12, 14 * 12], Image.LANCZOS)

    words = [rev_word_map[ind] for ind in seq]
    #print(words)
    for t in range(1,len(words)-1):
        if t > 50:
            break
        # The grid has 6 columns, so the row count must be ceil(len(words) / 6)
        plt.subplot(int(np.ceil(len(words) / 6.)), 6, t)

        plt.text(0, 1, '%s' % (words[t]), color='black', backgroundcolor='white', fontsize=12)
        plt.imshow(image)
        current_alpha = alphas[t, :]
        if smooth:
            alpha = skimage.transform.pyramid_expand(current_alpha.numpy(), upscale=12, sigma=8)
        else:
            alpha = skimage.transform.resize(current_alpha.numpy(), [14 * 12, 14 * 12])
        if t == 0:
            plt.imshow(alpha, alpha=0)
        else:
            plt.imshow(alpha, alpha=0.8)
        plt.set_cmap(cm.Greys_r)
        plt.axis('off')
    #plt.show()


import scipy

print(scipy.__version__)
#checkpoint = torch.load('./BEST_checkpoint_coco_5_cap_per_img_5_min_word_freq.pth.tar', map_location=str(device))
checkpoint = torch.load(r'C:/Users/Ternence/PycharmProjects/pythonProject/tuxiang/BEST_checkpoint_flickr8k_5_cap_per_img_5_min_word_freq.pth.tar', map_location=str(device))
decoder = checkpoint['decoder']
decoder = decoder.to(device)
decoder.eval()
encoder = checkpoint['encoder']
encoder = encoder.to(device)
encoder.eval()

# Load word map (word2ix)
with open(r'C:/Users/Ternence/kind2/Flickr8k/data/WORDMAP_flickr8k_5_cap_per_img_5_min_word_freq.json', 'r') as j:
    word_map = json.load(j,)
rev_word_map = {v: k for k, v in word_map.items()}  # ix2word
# Encode, decode with attention and beam search
seq, alphas = caption_image_beam_search(encoder, decoder, r'C:/Users/Ternence/kind2/Flickr8k/tupian/ren2.jpg', word_map, 5)
alphas = torch.FloatTensor(alphas)

# Visualize caption and attention of best sequence
visualize_att(r'C:/Users/Ternence/kind2/Flickr8k/tupian/ren2.jpg', seq, alphas, rev_word_map, True)
words = [rev_word_map[ind] for ind in seq]
print(words)

This is the image-caption visualization feature.
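
The least obvious step in the listing is how the "unrolled" top-k indices are turned back into a beam index and a word index: the scores of all beams are flattened into one vector of length s * vocab_size, so integer division by vocab_size gives the beam and the remainder gives the word. A small pure-Python sketch of that arithmetic follows; the helper name `top_k_flat` and the toy scores are made up for illustration and are not part of the poster's code.

```python
def top_k_flat(scores, k):
    """Return the k best (score, beam_index, word_index) triples.

    `scores` is a list of rows, one per beam, each of length vocab_size.
    """
    vocab_size = len(scores[0])
    # Flatten row by row, as scores.view(-1) does for a 2-D tensor.
    flat = [s for row in scores for s in row]
    # Indices of the k largest flattened scores, best first.
    best = sorted(range(len(flat)), key=lambda i: flat[i], reverse=True)[:k]
    # Integer division recovers the beam (row); the remainder recovers the word.
    return [(flat[i], i // vocab_size, i % vocab_size) for i in best]


# Made-up log-probability-like scores for 2 beams over a 4-word vocabulary.
toy_scores = [
    [-1.0, -0.2, -3.0, -2.5],
    [-0.9, -4.0, -0.1, -2.0],
]
picks = top_k_flat(toy_scores, k=3)
print(picks)  # [(-0.1, 1, 2), (-0.2, 0, 1), (-0.9, 1, 0)]
```

In the listing, the same roles are played by prev_word_inds (the beam each winner extends) and next_word_inds (the word appended to it).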

OP | Posted on 2022-12-23 10:50:17
编程追风梦 posted on 2022-12-23 10:02
Post your code and I will fix it up for you; I think there may be a problem with your code logic.

(b.py listing, identical to the one posted at 2022-12-23 10:49:32 above)

This is the image-caption visualization feature; it is b.py.

Posted on 2022-12-23 10:51:09
莫凡辰 posted on 2022-12-23 10:49
import torch
#import cv2
import torch.nn.functional as F

I think both of these functions could be defined in a.py. The part below them could also be wrapped in a function, and all of it put into a.py.
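
One way to read this suggestion is sketched below: do the expensive setup once, and expose the per-image work as a function that the long-running program calls as often as it likes. All names here (`load_models`, `caption_once`) and the placeholder return values are hypothetical; the real version would load the checkpoint and word map and call caption_image_beam_search.

```python
import functools

# --- what the refactored captioning code might look like (names are hypothetical) ---


@functools.lru_cache(maxsize=1)
def load_models():
    """Stand-in for the expensive one-time setup (checkpoint, word map)."""
    print("loading models once")
    return {"encoder": "encoder-placeholder", "decoder": "decoder-placeholder"}


def caption_once(image_path):
    """Stand-in for one captioning run; returns its result to the caller."""
    models = load_models()  # cached after the first call
    return f"caption for {image_path} using {models['decoder']}"


# --- what a.py might do: stay alive and call the function whenever needed ---

results = [caption_once(path) for path in ("img1.jpg", "img2.jpg", "img3.jpg")]
print(results)
print(load_models.cache_info().misses)
```

With this shape there is nothing to terminate or re-import: each call returns normally, and the model is loaded only on the first call.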

OP | Posted on 2022-12-23 10:51:19
suchocolate posted on 2022-12-23 09:26
A function should have a return; what does your b actually do?
Please post the full code.

(b.py listing, identical to the one posted at 2022-12-23 10:49:32 above)

This is b.py. It is the image-caption visualization feature.

OP | Posted on 2022-12-23 12:25:59
isdkz posted on 2022-12-23 10:22
If you want to run one program from inside another, it is better to use exec rather than import.

e.g.:

Problem solved. Many thanks!
