We design a concurrent separation logic for GPGPU, namely GPUCSL, and prove its soundness by using Coq. GPUCSL is based on a CSL proposed by Blom et al., which is for automatic verification of GPGPU kernels, but employs different inference rules because the rules in Blom’s CSL are not standard. For example, Blom’s CSL does not have a frame rule. Our CSL is a simple extension of the original CSL, hence it is more suitable as a basis of advanced properties proposed for other studies on CSLs. Our soundness proof is based on Vafeiadis’ method, which is for a CSL with a fork-join concurrency model. The proof reveals two problems in the Blom’s approach in terms of soundness and extensibility. First, their assumption that thread ID independence of a kernel implies barrier divergence freedom does not hold. Second, it is not easy to extend their proof for other CSLs with a frame rule. Although our CSL covers only a subset of CUDA, our preliminary experiment shows that it is useful and expressive enough to verify a simple kernel with barriers.